Abstract

A  new  technique  for  calculating  known  goodness  of  fit 
statistics  for  the  Normal  distribution  is  investigated. 
Samples  are  generated  for  a  Normal  (0,1)  distribution.  The 
means  of  these  samples  are  calculated  and  the  samples  are 
doubled  by  reflecting  sample  data  points  about  the  individual 
sample  means.  This  reflection  of  data  points  about  the  mean 
is  the  new  technique  for  generating  modified  statistics. 

After  the  sample  is  doubled,  critical  values  are  calculated 
for these modified Kolmogorov-Smirnov, Anderson-Darling, and
Cramer-von Mises statistics. Critical values are for the
original  sample  sizes.  An  extensive  power  study  is  done  to 
test  the  power  of  the  three  new  statistics'  critical  values 
versus  the  power  for  the  same  three  statistics,  calculated 
without  reflection. 

Powers of the new statistics are asymptotically
slightly higher than the powers of their non-reflected
counterparts when the distribution tested is also
symmetrical. The powers of the new statistics are substantially
lower when the distribution tested is non-symmetrical. The
powers are substantially higher for the modified statistics
when the continuous Uniform distribution is tested.

Complete tables of critical values for sample sizes
n = 3 through n = 60 are included for the modified
Kolmogorov-Smirnov and Anderson-Darling statistics.

A NEW GOODNESS OF FIT TEST FOR NORMALITY
WITH  MEAN  AND  VARIANCE  UNKNOWN 

I. Introduction

This thesis is an investigation of a technique that
involves doubling samples by reflecting the sample data
points about their arithmetic mean before calculating
goodness of fit statistics. Tables are to be generated for the
Kolmogorov-Smirnov, Anderson-Darling, and Cramer-von Mises
statistics using this technique. The usefulness of the
tables is demonstrated by a comprehensive power study.

General  Comments  About 
Goodness  of  Fit 

Goodness of Fit--Definition. Goodness of fit is
based on the idea that one can take a set of data and
determine how well it corresponds (or fits) with some known
distribution. "Goodness" refers to the quality of this fit.

Typical Non-parametric Test. In the area of non-parametric
statistics, most goodness of fit procedures attempt
to establish a statistical test of fit which relies on a yes/no
decision rather than some measure of "goodness." The
typical test uses a null hypothesis, $H_0$: the data are from
some known continuous distribution. The alternative, $H_A$, is
that the data are not from the hypothesized distribution.
Typically, the analyst is hoping to accept $H_0$. The purpose
of these tests is to determine if the data are distributed
similarly enough to the hypothesized distribution to ascribe
the properties of the hypothesized distribution to the
population from which the data were taken. For example, if the
analyst has a group of data he thinks is distributed
exponentially, he could use one of the goodness of fit tests to
reach a statistical conclusion about whether or not the
population from which the data are drawn is exponential.

For  the  more  common  theoretical  distributions,  tables 
of  critical  values  have  been  derived  for  different  goodness 
of  fit  statistics.  One  of  these  tables  has  been  derived  by 
Lilliefors  for  the  Kolmogorov-Smirnov  (K-S)  statistic  and 
the  normal  distribution  with  the  parameters  estimated  from 
the  sample  (Lilliefors,  1967).  To  use  his  tables,  one  would 
calculate  the  statistic  and  compare  it  with  the  critical 
value. If the calculated statistic were greater than the
critical value for the desired $\alpha$-level ($\alpha$ is the
probability that $H_0$ is rejected when $H_0$ is true), $H_0$
(that the data being tested are normally distributed) would be
rejected (Lilliefors, 1967).

Power Problems. Since the non-parametric test
involves a yes/no decision rather than some proportional
measure of goodness, the power of a given statistic is very
important to the analyst. Power is the probability of
rejecting $H_0$ when $H_A$ is, in fact, true (Mendenhall &
Scheaffer, 1973). The power of a given test provides some
measure of the quality of the statistical test itself. Thus,
the power is a measure of the degree of usefulness of the
goodness of fit test. If the power is low, then one cannot
state the distribution of the data with as much confidence as
if the power had been high.

One of the problems with many of the goodness of fit
statistics is that, with smaller sample sizes (n = 10), they
are not very powerful. (Throughout this paper, the term
"powerful" will be used to mean "of or having relatively
high power.") This lack of power is evident for the normal
distribution, in particular, even against skewed
distributions (Green & Hegazy, 1976; Stephens, 1974). None of the
statistics for which Green and Hegazy reported powers had
powers greater than 0.5 when sample size was ten (Green &
Hegazy, 1976).

Another problem with goodness of fit tests is that
they are more powerful against some distributions than they
are against others (Lilliefors, 1967; Stephens, 1974; Green
& Hegazy, 1976). In that sense, power study results are
again useful to the analyst. For example, suppose $H_0$ is
that some sample of data is drawn from a normal population.
Suppose the calculated goodness of fit test statistic is
.079. Suppose the critical value for that statistic is .087.
The test statistic value is less than the critical value,
so $H_0$ would not be rejected. In that case, the analyst could
refer to a power study and perhaps find that for this
particular statistic, the power versus the exponential is .97.
He could then state with high confidence that the data are
not exponential, but normal. From another power study, he
might also find that the power versus the double exponential
is .36. Thus, he could not have as much confidence in a
statement that the data are not from a double exponential,
but from a normal population.

This research effort is an investigation of a new
method that will, hopefully, provide more powerful goodness
of fit tests for three of the common goodness of fit
statistics. The new method is the doubling of samples about the
sample mean before calculating the statistic. This technique
is  applied  to  calculating  critical  values  for  the  normal 
distribution. 

Three Test Statistics. The three test statistics
being used have been tested for their power when calculated
for the normal distribution (Stephens, 1974; Green & Hegazy,
1976; Lilliefors, 1967). These previous tests suggest a
methodology for the power studies done using the technique
being investigated here. The statistics which will be used
are the Kolmogorov-Smirnov (K-S) statistic (Massey, 1951),
the Anderson-Darling (A-D) statistic (Anderson & Darling,
1954), and the Cramer-von Mises (CVM) statistic (Anderson
& Darling, 1954).

The  statistics  are  discussed  in  greater  detail  in 
Chapter  II,  the  background  chapter  of  this  report.  It  is 
important  to  note  that  all  statistics  in  this  research  are 
calculated  after  estimating  the  mean  and  variance  from  the 
sample data.

Primary Research Issue

The new statistical technique studied in this
research is motivated by the work of Schuster (1973; 1975).
Schuster suggests that samples of symmetrical distributions
can be reflected about the parameter of symmetry to generate
a new sample with identical parameters. He uses this
concept to develop a new statistic that uses two samples, the
original one and the reflected one (Schuster, 1973). The
technique suggested by this author results in a different
statistic than Schuster's. However, the statistics
probably are not totally dissimilar. Both Schuster's and this
author's techniques can be expected to have similar
characteristics because they both use reflection.

The New Technique. The logic of the technique
proposed by the author follows. If a sample of some size, n
(e.g., n = 10), is taken from a normal population, the actual
number of points used to calculate the test statistic can be
doubled about the arithmetic mean of the sample data. In
other words, rather than calculating the critical values for
the normal distribution at n = 10 with ten data points,
twenty actual points will be used. The technique is
demonstrated with an example in Chapter II.

More Restrictive Critical Values. It is felt that
the use of this reflection technique will result in the
generation of more restrictive critical values. Because the
critical values supposedly will be more restrictive, it is
possible that the probability of rejecting $H_0$ when $H_A$ is
true will increase. In other words, the possibility that
the power will be greater when data points are reflected
about their mean will be investigated.

General Research Hypothesis. The general hypothesis
being used to guide this research can be stated as follows:

Hypothesis: For the normal distribution, the K-S, A-D,
and CVM statistics, modified by calculation after
doubling the sample by reflecting data points about the
sample mean, provide more powerful tests of goodness of
fit than do the same statistics calculated without
reflection.

While it is hypothesized that generally more
powerful statistics will result from reflection, some
implications from Schuster's work should be considered since he
also used reflection. First, Schuster proved that for his
statistic better results could only be expected when
alternative distributions are also symmetric (Schuster, 1973).
One would, thus, not be surprised to find higher powers only
versus symmetrical distributions for the new statistic.
Second, Schuster only obtained better results asymptotically
when the parameters were estimated from the sample. In
other words, his statistic was "better" only for larger
sample sizes (Schuster, 1973). It should not be surprising
if this is also the case for the new technique.

Primary Purpose. The primary purpose of this thesis
is to test the above hypothesis and to generate tables of
critical values for the three previously mentioned statistics,
modified by doubling the sample by reflection. While the
basic hypothesis being tested is presented in the previous
paragraph, several other techniques are to be tested before
developing the computer programs for generation of critical
values. These are briefly described in the following
paragraphs. More detailed discussions are presented in Chapter
II.

Bootstrap  Technique 

Continuous vs. Discrete. Prior critical value tables
have been determined by calculating and ordering statistics
for a large number of random samples from the test
distribution. If 1000 statistics are calculated, the critical value
for $\alpha = .05$ is the 950th order statistic (counting from
the smallest). The process uses discrete values to determine
critical values for continuous distributions.

The  bootstrap  technique  developed  by  Efron  (1979) 
and  recently  demonstrated  by  Johnston  (1980)  is  a  method  for 
representing  these  order  statistics  on  a  continuous  spectrum. 
This  is  done  by  plotting  the  values  of  the  order  statistics 
and  representing  the  spaces  between  them  as  piecewise  linear 
functions  (Efron,  1979;  Johnston,  1980). 

Interpolation. If the order statistics are plotted
versus a plotting position that would represent each of the
order statistics on a scale between zero and one, it is
possible to interpolate for the desired percentile and,
therefore, extract a more accurate value. It is also possible
that by using this technique, cost savings can be realized,
since fewer random deviates may have to be generated in order
to get consistent critical values at the desired $\alpha$ levels.


Plotting  Positions 

As mentioned above, the bootstrap technique requires
the use of some plotting position to scale the order
statistics between zero and one. Three different plotting
positions are tested to see if there is any noticeable arithmetic
difference among them with large numbers (n > 100). The
three plotting positions tested are called the median rank,
a modified step rank, and the average of mean and mode ranks.
These three plotting positions are presented in detail in
Chapter II.

If  the  differences  among  the  three  plotting  positions 
are  judged  to  be  minor,  only  one  of  the  positions  will  be 
used.  If  there  are  major  differences,  then  critical  values 
will  be  calculated  using  all  three  positions,  and  only  the 
most  powerful  results  will  be  tabled. 

The reason these positions are the ones being tested
is that they all have a desired symmetrical property. They
all provide symmetry in the following sense. Suppose one has
a graph with order statistics on the horizontal axis and
plotting position on the vertical axis. The vertical
component of the plot at the first order statistic is identical
to the quantity: one minus the vertical component at the
last order statistic.


Presentation  of  Research 

The report on this thesis effort is presented in
five chapters. The first of these is this introduction.
Although the introduction is meant to be detailed enough for
a reader familiar with the research area, Chapter II is a
background chapter for the use of anyone interested in more
details about the techniques that have been discussed in the
introduction.

The methods used to examine the above techniques are
presented in Chapter III. The results of the research
described in Chapter III are presented and discussed in
Chapter IV. Tables of critical values and results of power
studies are located in that chapter. The final chapter
consists of conclusions and recommendations.

Primary  Purpose  Reemphasized 

The  primary  purpose  of  this  research  effort  is  to 
test  the  technique  of  reflecting  data  points  about  the  mean 
and  to  create  tables  of  critical  values  of  the  modified  K-S, 
A-D,  and  CVM  statistics  for  the  normal  distribution  using 
that  technique.  Statistics  are  calculated  using  normalized 
data with the mean and variance estimated from the sample.

II. Background

In the previous chapter, the basic concepts and
techniques being studied in this thesis were presented.
This chapter explains some of those techniques in greater
detail. The chapter is divided into six sections. These
include some introductory comments; a presentation of the
three plotting positions to be examined; a discussion of the
K-S, A-D, and CVM statistics; further explanation of the
bootstrap technique; an example of doubling samples by
reflecting them about their means; and a summary.

Introductory  Comments 

Purpose. The purpose of this chapter is to present
more detailed discussions of some of the techniques mentioned
in Chapter I. This chapter is meant to be used as a
reference chapter. One familiar with the research area might not
need to read this chapter.

Format. The format is different from that used in
Chapter I. The sequence is now the order in which the ideas
are studied in the research. The following is a list of the
topics in the order of discussion:

a. Plotting positions
b. Three statistics
   1. K-S (Kolmogorov-Smirnov)
   2. A-D (Anderson-Darling)
   3. CVM (Cramer-von Mises)
c. Bootstrap technique
d. Doubling samples about the mean


Plotting  Positions 

Why? From Chapter I, the reason the plotting
position is necessary is to provide a vertical coordinate scaled
between zero and one. A vertical coordinate is required for
each value of the order statistic represented on the
horizontal axis. Consider drawing n samples and calculating the
same statistic for each sample. The result would be a set of
n statistics. When ordered, the set is of n order statistics.
Given the set of order statistics, $X_{(1)}, X_{(2)}, X_{(3)},
\ldots, X_{(n)}$, n is the total number of statistics and i is
the rank of a given statistic, i = 1, 2, 3, ..., n. For example,
the rank of $X_{(3)}$ is 3, or i = 3. Letting the value of
order statistics be represented by the horizontal axis and
letting the vertical axis be scaled between zero and one, the
plotting positions being tested allow the statistics to
represent points on a continuous function.

For example, let n = 10 samples. Suppose this
resulted in the ten statistic values in order ($X_{(i)}$) listed
below. If one used the median rank (which is defined later)
as the vertical plotting position ($Y_{(i)}$), he would get the
list as shown on the next page. These values are plotted in
Fig. 1. If straight lines are drawn between the plotted
positions, a piecewise linear continuous function results.

In the research, each of the three plotting positions
will be examined to see if there is much difference among
them with values of n greater than 99.

The plotting positions are now described in detail.

Median Rank. The formula for median rank is as
follows:

\[ \text{median rank} = \frac{i - 0.3}{n + 0.4} \tag{1} \]

where

i = rank of order statistic being plotted
n = total number of order statistics

The above formula is well known. From the example, suppose
the statistic being plotted is $X_{(5)} = 0.98$. In this case,
where n = 10, the median rank is as follows:

\[ \text{median rank} = \frac{5 - 0.3}{10 + 0.4} = .452 \tag{2} \]


A property of this plotting position worth noting
is that the rank of $X_{(1)}$ is the same distance from zero as
the rank of $X_{(n)}$ is from one. For instance, at n = 10, the
median rank for $X_{(1)}$ is .067 and for $X_{(10)}$ is .933.
Let the median rank of $X_{(i)}$ be defined as $Y_{(i)}$. Then,
$Y_{(1)} - 0.0 = .067$ and $1.0 - Y_{(10)} = .067$. This is the
desired symmetry discussed in Chapter I.

Modified Step Rank. The second ranking procedure
discussed is the modified step rank. To understand this,
one must first know the step rank formula. The formula for
the step rank is also well known and is as follows:

\[ \text{step rank} = \frac{i - 1}{n} \tag{3} \]

The reason this formula needs to be modified is that it does
not have the same type of symmetry as that shown for the
median rank. For example, again let n = 10; then for i = 1,
the step rank is 0.0. For i = 10, the step rank is 0.9.
If $Y_{(i)}$ is the step rank of $X_{(i)}$, then
$1 - Y_{(10)} = 0.1$ and $Y_{(1)} - 0.0 = 0.0$. The desired
symmetry does not exist.

The desired symmetry can be obtained if the
following modification is made:

\[ \text{modified step rank} = \frac{i - 0.5}{n} \tag{4} \]

Let $Y_{(i)}$ be the modified step rank of $X_{(i)}$. Then, at
n = 10, $Y_{(1)} = 0.05$ and $Y_{(10)} = 0.95$. It follows that
$Y_{(1)} - 0 = 0.05$ and $1 - Y_{(10)} = 0.05$. Hence, the
desired symmetry exists.

Average of Mode and Mean Ranks. The last plotting
position discussed uses the average of the mode and mean
ranks. The formulas for the mean and mode ranks are also
well known. Three ranks are presented below: the mean rank,
the mode rank, and the average of the two:

\[ \text{mean rank} = \frac{i}{n + 1} \tag{5} \]

\[ \text{mode rank} = \frac{i - 1}{n - 1} \tag{6} \]

\[ \text{average} = \frac{1}{2}\left(\frac{i}{n + 1} + \frac{i - 1}{n - 1}\right) \tag{7} \]

The mode and mean ranks do not have the desired symmetry
about zero and one. The average of those two ranks does.
Though not done here, this fact can be easily demonstrated.
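
The three plotting positions tested can be compared numerically. The following Python sketch is only an illustration; the function names are hypothetical, and the formulas are equations (1), (4), and (7) as read from the text. It reproduces the n = 10 values quoted above and checks the endpoint symmetry, that is, that the rank of the first order statistic equals one minus the rank of the last.

```python
def median_rank(i, n):
    """Equation (1): median rank of the i-th of n order statistics."""
    return (i - 0.3) / (n + 0.4)


def modified_step_rank(i, n):
    """Equation (4): step rank shifted by half a step."""
    return (i - 0.5) / n


def mean_mode_average_rank(i, n):
    """Equation (7): average of the mean rank and the mode rank."""
    mean_rank = i / (n + 1)
    mode_rank = (i - 1) / (n - 1)
    return (mean_rank + mode_rank) / 2


def endpoint_gap(position, n):
    """Y(1) - (1 - Y(n)); zero when the endpoint symmetry holds."""
    return position(1, n) - (1 - position(n, n))


n = 10
for position in (median_rank, modified_step_rank, mean_mode_average_rank):
    print(position.__name__,
          round(position(1, n), 3),
          round(position(n, n), 3),
          abs(endpoint_gap(position, n)) < 1e-12)
```

For large n the three positions differ by at most about 1/n at any rank, which is consistent with the expectation that the choice matters little when n exceeds 100.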

Three Statistics

This section is a presentation of the three
statistics being studied. All statistics will be discussed as
they apply to the normal distribution. The parameters of
the normal distribution, $\mu$ and $\sigma$, are unknown and
will be estimated for each sample by their maximum likelihood
estimators, $\bar{x}$ and S (Mendenhall & Scheaffer, 1973).


Kolmogorov-Smirnov (K-S) Statistic. The K-S statistic is

\[ D = \max\,|F^*(x) - S_N(x)| \tag{10} \]

where

the sample data points are ordered,
$F^*(x)$ = normal CDF value of a given data point,
$S_N(x)$ = sample cumulative step function.

$\bar{x}$ and S are needed to find $F^*(x)$. $S_N(x)$ has two
values for each ordered data point. These values are i/n and
(i-1)/n, where i is the rank of the i-th ordered data point
and n is the sample size. The following is an example of how to
calculate the K-S statistic for a given sample:


   x        S_N(x)         F*(x)     |F*(x) - S_N(x)|
        (i-1)/n    i/n               (i-1)/n     i/n

  0.2    .000     .125     .1038      .1038     .0212
  1.6    .125     .250     .2033      .0783     .0467
  2.1    .250     .375     .2483      .0017     .1267
  3.0    .375     .500     .3446      .0304     .1554
  4.8    .500     .625     .5596      .0596     .0654
  5.0    .625     .750     .5871      .0379     .1629
  8.1    .750     .875     .8790      .1290     .0040
  9.6    .875    1.000     .9484      .0734     .0516

$D = \max|F^*(x) - S_N(x)| = .1629$
$\bar{x} = 4.3$, S = 3.249, n = 8
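
The calculation in the table can be sketched in Python as follows. This is an illustration under stated assumptions, not the thesis program: the function names and the `z_decimals` parameter are hypothetical, the normal CDF is computed from the error function rather than read from a table, and S is computed with an n - 1 divisor because that is the divisor that reproduces S = 3.249 for this sample. Rounding the standardized values to two decimals mimics a table lookup and reproduces D = .1629; without rounding, D comes out slightly larger.

```python
import math


def normal_cdf(z):
    """Standard normal CDF computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ks_statistic(sample, z_decimals=None):
    """K-S statistic D with the mean and S estimated from the sample.

    z_decimals (hypothetical option): round each standardized value
    before the CDF is evaluated, as a printed normal table would.
    """
    xs = sorted(sample)
    n = len(xs)
    mean = sum(xs) / n
    # Assumption: n - 1 divisor, which matches S = 3.249 in the example.
    s = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
    d = 0.0
    for i, x in enumerate(xs, start=1):
        z = (x - mean) / s
        if z_decimals is not None:
            z = round(z, z_decimals)
        f = normal_cdf(z)
        # The step function takes both (i-1)/n and i/n at each point.
        d = max(d, abs(f - (i - 1) / n), abs(f - i / n))
    return d


sample = [0.2, 1.6, 2.1, 3.0, 4.8, 5.0, 8.1, 9.6]
d_table = ks_statistic(sample, z_decimals=2)  # table-style lookup
d_exact = ks_statistic(sample)                # unrounded z values
print(round(d_table, 4), round(d_exact, 4))
```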


Anderson-Darling (A-D) Statistic. A common notation
for the A-D statistic is $W^2$ (Anderson & Darling, 1954). Let
$X_{(1)} \le X_{(2)} \le X_{(3)} \le \ldots \le X_{(n)}$ be n
observations from the sample in order. Let $u_j = F(X_{(j)})$ =
the normal CDF value with $\bar{x}$ and S as estimators of
$\mu$ and $\sigma$. Then, the A-D statistic (Anderson & Darling,
1954) is

\[ W^2 = -n - \frac{1}{n}\sum_{j=1}^{n} (2j - 1)\left[\ln u_j + \ln(1 - u_{n-j+1})\right] \tag{11} \]

Letting $A = \ln u_j$ and $B = \ln(1 - u_{n-j+1})$, the
following is a numerical example using the same data points as
the K-S sample:


  j     x     F(x) = u_j   u_{n-j+1}      A         B      (2j-1)(A+B)

  1    0.2      .1038        .9484     -2.265    -2.964     - 5.229
  2    1.6      .2033        .8790     -1.593    -2.112     -11.115
  3    2.1      .2483        .5871     -1.393    - .885     -11.390
  4    3.0      .3446        .5596     -1.065    - .820     -13.195
  5    4.8      .5596        .3446     - .581    - .423     - 9.036
  6    5.0      .5871        .2483     - .533    - .285     - 8.998
  7    8.1      .8790        .2033     - .129    - .227     - 4.628
  8    9.6      .9484        .1038     - .053    - .110     - 2.445

                                                  Sum  =    -66.036

A-D = $W^2$ = -8 - (1/8)(-66.036)
            = -8 + 8.2545
            = .2545


Cramer-von Mises (CVM) Statistic. The Cramer-von
Mises statistic (Anderson & Darling, 1954) is the third to
be studied in this research.

Let n = sample size,
$u_j = F(X_{(j)})$ = CDF value for normal distribution,
and
$X_{(1)} \le X_{(2)} \le X_{(3)} \le \ldots \le X_{(n)}$ be n
observations in order;
then

\[ \text{CVM} = \frac{1}{12n} + \sum_{j=1}^{n}\left(u_j - \frac{2j - 1}{2n}\right)^2 \tag{12} \]

The following is a numerical example of calculation of the
CVM statistic:

  j     x     F(x) = u_j   A = (2j-1)/2n   (u_j - A)^2

  1    0.2      .1038          .0625          .00171
  2    1.6      .2033          .1875          .00025
  3    2.1      .2483          .3125          .00412
  4    3.0      .3446          .4375          .00863
  5    4.8      .5596          .5625          .00001
  6    5.0      .5871          .6875          .01008
  7    8.1      .8790          .8125          .00442
  8    9.6      .9484          .9375          .00012

                                    Sum  =    .02934

CVM = 1/((12)(8)) + .02934
    = .01042 + .02934
    = .03976
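
Equations (11) and (12) can be checked against the two worked examples with a short Python sketch. As before, this is only an illustration: the function names and the `z_decimals` option are hypothetical, the normal CDF comes from the error function, and S is assumed to use an n - 1 divisor (the choice that matches S = 3.249). Because the tables were built from rounded CDF and logarithm values, the computed results agree with .2545 and .03976 only to about two or three decimals.

```python
import math


def normal_cdf(z):
    """Standard normal CDF computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def fitted_cdf_values(sample, z_decimals=None):
    """u_j = F(X_(j)) for the ordered sample, parameters estimated."""
    xs = sorted(sample)
    n = len(xs)
    mean = sum(xs) / n
    # Assumption: n - 1 divisor, matching S = 3.249 in the example.
    s = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
    zs = [(x - mean) / s for x in xs]
    if z_decimals is not None:
        zs = [round(z, z_decimals) for z in zs]
    return [normal_cdf(z) for z in zs]


def anderson_darling(u):
    """Equation (11); u must be the CDF values in ascending order."""
    n = len(u)
    total = sum((2 * j - 1) * (math.log(u[j - 1]) + math.log(1.0 - u[n - j]))
                for j in range(1, n + 1))
    return -n - total / n


def cramer_von_mises(u):
    """Equation (12); u must be the CDF values in ascending order."""
    n = len(u)
    return 1.0 / (12 * n) + sum((u[j - 1] - (2 * j - 1) / (2 * n)) ** 2
                                for j in range(1, n + 1))


sample = [0.2, 1.6, 2.1, 3.0, 4.8, 5.0, 8.1, 9.6]
u = fitted_cdf_values(sample, z_decimals=2)  # table-style lookup
a_d = anderson_darling(u)
cvm = cramer_von_mises(u)
print(round(a_d, 3), round(cvm, 4))
```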

Bootstrap  Technique 

The bootstrap technique is used in this thesis as it
was demonstrated by Johnston (1980). One of the three
plotting positions tested will be used to represent the vertical
axis from zero to one. The values of the n test statistics
will be the horizontal components. Lines between the plots
will be interpolated, as was demonstrated in Fig. 1.

Extrapolation. In addition to the interpolations,
extrapolations are necessary to find values for $X_{(0)}$ and
$X_{(n+1)}$, where $X_{(i)}$ is the i-th order statistic,
i = 0, 1, 2, ..., n, n+1. If $Y_{(i)}$ represents the vertical
rank determined by one of the ranking procedures, $Y_{(1)}$ is
greater than zero and $Y_{(n)}$ is less than one. Since a
vertical scale from zero to one is desirable in order to find
critical values for any level of significance between zero and
one, values of $X_{(0)}$ and $X_{(n+1)}$ must be found for
$Y_{(0)} = 0$ and $Y_{(n+1)} = 1$.

To find $X_{(0)}$, the slope of the line between $X_{(1)}$
and $X_{(2)}$ is determined. That line is then extrapolated to
its intercept with the x-axis. If the intercept is greater
than or equal to zero, then $X_{(0)}$ equals the intercept
value. If the intercept is less than zero, then $X_{(0)} = 0$.
(Since all of the statistics being tested yield non-negative
values, $X_{(0)}$ cannot be allowed to be negative.) The line
between $X_{(0)}$ and $X_{(1)}$ is, then, interpolated.

To find $X_{(n+1)}$, the same technique is used, except
negative values are not a problem. The line between
$X_{(n-1)}$ and $X_{(n)}$ is formed. That line is then extended
to its intercept with the line Y = 1. The intercept value is
the value for $X_{(n+1)}$.

Figure 2 is a display of the above three situations.
Graph (a) depicts the situation where the x-intercept is less
than zero. In that case, the solid line is the line from
$(X_{(0)}, Y_{(0)})$ to $(X_{(1)}, Y_{(1)})$. Graph (b) is the
case in which the x-intercept is greater than or equal to zero.
Graph (c) of Fig. 2 represents finding $X_{(n+1)}$.

Finding the Critical Value. To find a critical
value, all that is necessary, graphically, is to find
$1 - \alpha$ on the vertical axis and extend along the line,
$Y = 1 - \alpha$, to intercept the plotted function. The value
of the horizontal component is the critical value of the
statistic at significance level $\alpha$.

Finding the critical value with a computer requires
finding the largest $Y_{(i)}$ that is less than $1 - \alpha$.
Suppose that $Y_{(k)}$ is the largest such rank. Then, the
standard linear slope-intercept formula (y = mx + b) is used to
find the critical value. The change in y can be found using
$Y_{(k)}$ and $Y_{(k+1)}$. Similarly, the change in x can be
found using $X_{(k)}$ and $X_{(k+1)}$. After finding the
constant, b, at $(X_{(k)}, Y_{(k)})$, one can then let y equal
$1 - \alpha$ in order to find x, the critical value.

Example of Technique. As in the example in Fig. 1,
suppose ten samples are taken. Let the following numbers be
the ten statistics calculated:

   i     Modified Step Rank     Statistics (X_(i))

   1           .05                    .22
   2           .15                    .41
   3           .25                    .42
   4           .35                    .67
   5           .45                    .98
   6           .55                   1.02
   7           .65                   1.03
   8           .75                   1.08
   9           .85                   1.12
  10           .95                   1.13

In Fig. 3, the statistics are plotted versus their modified
step ranks. From the above list,

$Y_{(1)} = 0.05$, $Y_{(2)} = 0.15$, $X_{(1)} = 0.22$, $X_{(2)} = 0.41$

Using the equation, y = mx + b,

\[ m = \text{slope} = \frac{Y_{(2)} - Y_{(1)}}{X_{(2)} - X_{(1)}} = \frac{.15 - .05}{.41 - .22} = 0.52 \]

\[ b = Y_{(1)} - mX_{(1)} = .05 - (.52)(.22) = -.065 \]

\[ x = \frac{0.0 - b}{m} = \frac{.065}{.52} = .125 \]

Since $x = .125 \ge 0$, $X_{(0)} = x$. Again, if x had been less
than zero, $X_{(0)}$ would have been set equal to zero.

Fig. 3. Example of Bootstrap Technique

Extrapolation for $X_{(11)}$ is performed the same way.
$Y_{(10)}$, $Y_{(9)}$, $X_{(10)}$, and $X_{(9)}$ are used to
find the slope. The constant, b, is calculated at either
$(X_{(10)}, Y_{(10)})$ or at $(X_{(9)}, Y_{(9)})$. Then,
$X_{(11)} = (1.0 - b)/m$, where m is the slope.

Now that the function is continuous (by extrapolation)
on the interval (0,1), the critical values can be found. At
$\alpha = .10$, previous studies (Lilliefors, 1967; Green &
Hegazy, 1976; Anderson & Darling, 1954; Massey, 1951) would
have picked 1.12, the ninth order statistic, as the critical
value. Using the bootstrap method, the value is 1.125 (if
modified step ranks are used).

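To get the critical value using the bootstrap
technique, the largest $Y_{(k)}$ less than or equal to .90 is
found. In this case, this is $Y_{(9)} = .85$. Therefore, k = 9
and k + 1 = 10. Then,

\[ m = \frac{Y_{(10)} - Y_{(9)}}{X_{(10)} - X_{(9)}} = \frac{.95 - .85}{1.13 - 1.12} = 10 \]

\[ b = Y_{(9)} - mX_{(9)} = .85 - (10)(1.12) = -10.35 \]

\[ x = \frac{.90 - b}{m} = \frac{.90 + 10.35}{10} = 1.125 \]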

As one can see, the critical value will vary with
statistics calculated for random samples. One of the issues
of this research is the number of samples needed to get
consistent results.
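
The extrapolation and interpolation steps of this section can be sketched in Python. The sketch is a hedged illustration rather than the thesis program: the function names are hypothetical, modified step ranks are assumed as the plotting position, and the statistics are assumed to be distinct so that no slope is undefined. Applied to the ten statistics of the example, it gives the lower extrapolated point .125 and the critical value 1.125 at a significance level of .10.

```python
def modified_step_ranks(n):
    """Plotting positions (i - 0.5) / n for i = 1, ..., n."""
    return [(i - 0.5) / n for i in range(1, n + 1)]


def extend_curve(stats):
    """Return x and y points of the piecewise linear curve on [0, 1].

    Assumes the statistics are distinct (no zero-width segments).
    """
    xs = sorted(stats)
    ys = modified_step_ranks(len(xs))
    # Lower end: extend the first segment down to y = 0, floor at 0.
    m_low = (ys[1] - ys[0]) / (xs[1] - xs[0])
    x_low = max(0.0, xs[0] - ys[0] / m_low)
    # Upper end: extend the last segment up to y = 1.
    m_high = (ys[-1] - ys[-2]) / (xs[-1] - xs[-2])
    x_high = xs[-1] + (1.0 - ys[-1]) / m_high
    return [x_low] + xs + [x_high], [0.0] + ys + [1.0]


def critical_value(stats, alpha):
    """Interpolate the curve at height 1 - alpha."""
    xs, ys = extend_curve(stats)
    target = 1.0 - alpha
    for k in range(len(ys) - 1):
        if ys[k] <= target <= ys[k + 1]:
            frac = (target - ys[k]) / (ys[k + 1] - ys[k])
            return xs[k] + frac * (xs[k + 1] - xs[k])
    raise ValueError("alpha must lie between 0 and 1")


stats = [0.22, 0.41, 0.42, 0.67, 0.98, 1.02, 1.03, 1.08, 1.12, 1.13]
curve_x, curve_y = extend_curve(stats)
print(round(curve_x[0], 3), round(critical_value(stats, 0.10), 3))
```

Interpolating along the segment directly is algebraically the same as solving y = mx + b for x; it simply avoids computing the constant b.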


Doubling  Samples  About  the  Mean 

The following is a description of the technique of
doubling samples about their means. First, a sample of
random deviates is collected. Second, the arithmetic mean
is calculated.

The third step has several sub-steps. Let i = 1, 2,
3, ..., n, and n be the number of random deviates in a sample.
Then, the new deviate (created by reflection about the mean)
is $x_{n+i} = 2\bar{x} - x_i$. Looking at Fig. 4, suppose
$x_i = 2.4$ and the mean of all the $x_i$'s is $\bar{x} = 3.4$.
Then, $x_{n+i} = 2(3.4) - 2.4 = 4.4$. Notice that both points
are equidistant from the mean. The mean from the newly created
sample is the same as the original one.

Fig. 4. Doubling Samples About the Mean

Example .  An  example  is  presented  in  Table  I.  The 
first  column  is  the  five  data  points  in  the  original  sample. 
The  second  column  is  of  the  left-hand  sides  of  five  equations 
representing  2x  -  xi  for  each  data  point.  The  third 
column  is  the  reflected  data  point. 


TABLE I

Reflection of Data Points About the Mean

Data Points                            Reflected
  (n = 5)       $2\bar{x} - x_i$       Data Point

    0.2         2(3.4) - 0.2 =            6.6
    1.6         2(3.4) - 1.6 =            5.2
    2.1         2(3.4) - 2.1 =            4.7
    5.0         2(3.4) - 5.0 =            1.8
    8.1         2(3.4) - 8.1 =           -1.3

Before reflection: $\bar{x} = 3.4$        After reflection: $\bar{x} = 3.4$
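The reflection step is short enough to write out directly. The Python sketch below reproduces the five data points of Table I; the function name double_sample is an illustrative choice, not the author's DUBSAM subroutine.

```python
def double_sample(xs):
    """Return xs followed by the reflection of each point about the mean of xs."""
    if not xs:
        raise ValueError("sample must not be empty")
    mean = sum(xs) / len(xs)
    return list(xs) + [2.0 * mean - x for x in xs]


sample = [0.2, 1.6, 2.1, 5.0, 8.1]          # the five data points of Table I
doubled = double_sample(sample)
reflected = [round(x, 1) for x in doubled[len(sample):]]
print(reflected)                             # [6.6, 5.2, 4.7, 1.8, -1.3]
print(round(sum(doubled) / len(doubled), 1)) # 3.4, the mean is unchanged
```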

Summary 

This chapter is a set of detailed discussions of techniques referred to in Chapter I. The techniques discussed are plotting positions (ranking techniques), the three statistics studied, the bootstrap technique, and the procedure of doubling samples about their arithmetic mean. Specific references will be made to this chapter in the following chapter on procedure.


III.  Procedure 

The techniques to be used in the experimental procedure have been presented in detail in Chapters I and II. This chapter is a discussion of how those techniques are to be applied. Since all of the data are generated using Monte Carlo simulation of pseudo-random deviates, this is essentially a chapter about how the previously discussed techniques are combined into computer programs to generate and manipulate Monte Carlo data for testing the research hypothesis presented in Chapter I.

This chapter has four major sections. The first is about how the three plotting positions are to be tested. The second concerns the calculation of statistics and their critical values. The third section is a discussion of the generation of tables of critical values. In the last section, the construction of the power study is presented.

Plotting  Positions 

The purpose of the first phase of research is to compare three plotting positions. The search is for meaningful differences among the median rank (M), modified step rank (MS), and the average of the mean and mode ranks (AMM) at various values of n (n is the number of statistics to be plotted). If there are meaningful arithmetic differences, all three will be used. If no meaningful differences exist, only the modified step rank will be used for simplicity. "Meaningful" is an intentionally loose term. The researcher cannot judge whether the differences are important, or "meaningful", until he has seen what the differences actually are. The plotting positions themselves are thoroughly discussed in Chapter II.

The Computer Program. Since visual comparisons of plotting positions for each value of i (i = 1, 2, 3, ..., n) are desired, the comparison is done via computer. The program used is simple and is included in Appendix A. The program has the following three major steps:

1. For some n, find the value for each plotting position at every i, i = 1, 2, 3, ..., n.

2. Find the differences among the three plotting positions at each value of i.

3. Print out for every value of i:

   a. the values of the three positions

   b. the absolute value of:

      1. M - MS

      2. M - AMM

      3. AMM - MS

The program is run for n = 100, n = 150, and n = 300 statistics.
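The three steps can be sketched in Python as shown below. The formulas for the three plotting positions are assumptions here, since they are defined in Chapter II rather than in this chapter: the median rank is taken as (i - 0.3)/(n + 0.4), the modified step rank as (i - 0.5)/n, and AMM as the average of the mean rank i/(n + 1) and the mode rank (i - 1)/(n - 1). All function names are hypothetical and this is not the COMPAR program itself.

```python
def median_rank(i, n):
    # Assumed form: a widely used approximation to the median rank.
    return (i - 0.3) / (n + 0.4)


def modified_step_rank(i, n):
    # Assumed form: midpoint of the i-th step of the empirical CDF.
    return (i - 0.5) / n


def avg_mean_mode_rank(i, n):
    # Assumed form: average of mean rank i/(n+1) and mode rank (i-1)/(n-1).
    return 0.5 * (i / (n + 1) + (i - 1) / (n - 1))


def max_abs_differences(n):
    """Largest |M - MS|, |M - AMM| and |AMM - MS| over i = 1, ..., n (n >= 2)."""
    worst = {"M-MS": 0.0, "M-AMM": 0.0, "AMM-MS": 0.0}
    for i in range(1, n + 1):
        m = median_rank(i, n)
        ms = modified_step_rank(i, n)
        amm = avg_mean_mode_rank(i, n)
        worst["M-MS"] = max(worst["M-MS"], abs(m - ms))
        worst["M-AMM"] = max(worst["M-AMM"], abs(m - amm))
        worst["AMM-MS"] = max(worst["AMM-MS"], abs(amm - ms))
    return worst


for n in (100, 150, 300):
    print(n, max_abs_differences(n))
```

Under these assumed forms, the largest |M - MS| at n = 150 evaluates to about 1.32 x 10^-3, and the largest |AMM - MS| is far smaller.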

Calculation of Statistics and Critical Values

The calculation of statistics for random samples is at the heart of this research effort. All programs that are used calculate statistics. All either calculate or use previously calculated critical values. The point is that all the programs use much of the same flow and code to accomplish these calculations.

The four basic steps used in the programs include the following:

1. Calculating statistics (using different subroutines for each statistic),

2. Storing the statistics in a vector array,

3. Ordering the elements of that array from smallest to largest, and

4. Calculating critical values using the bootstrap method that was discussed in Chapter II.

Subprogram for Calculating Statistics. The logic for that portion of each program that deals with calculating the statistic is shown in Fig. 5. The letter on the right-hand side of each block is the block identifier.

Subprogram for Finding Critical Values. The logic for that portion of each program that is used to find the critical values is shown in Fig. 6.
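The four steps can be sketched for the K-S statistic in Python as follows. This is an illustration under stated assumptions rather than the author's code: the standard library generator stands in for the IMSL routines, S is computed with divisor n - 1, and the critical value is read off as a plain order statistic rather than by the bootstrap interpolation. The estimate flag and all function names are hypothetical.

```python
import math
import random


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ks_statistic(sample, estimate=True):
    """K-S distance between the sample and the standard normal CDF."""
    xs = sorted(sample)
    n = len(xs)
    if estimate:
        # Standardize by the sample mean and S (divisor n - 1 is an assumption).
        mean = sum(xs) / n
        s = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
        xs = [(x - mean) / s for x in xs]
    d = 0.0
    for i, z in enumerate(xs, start=1):
        f = normal_cdf(z)
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def critical_value(n, n_samples, alpha, estimate=True, seed=1):
    rng = random.Random(seed)
    # Steps 1 to 3: calculate, store, and order the statistics.
    stats = sorted(
        ks_statistic([rng.gauss(0.0, 1.0) for _ in range(n)], estimate)
        for _ in range(n_samples)
    )
    # Step 4, simplified: a plain order statistic instead of interpolation.
    k = min(n_samples - 1, math.ceil((1.0 - alpha) * n_samples) - 1)
    return stats[k]


print(critical_value(20, 2000, 0.05, estimate=False))  # parameters assumed known
print(critical_value(20, 2000, 0.05, estimate=True))   # parameters estimated
```

Running it both ways shows how much smaller the critical values become once the parameters are estimated from the sample.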

Testing the Program. The program can be tested for validity, since tables of critical values for the straightforward calculation of the Kolmogorov-Smirnov (K-S) statistic are readily available. With 5000 samples, the program can be run without estimating the parameters, i.e., assuming μ = 0 and σ = 1. These critical values can be compared with those obtained by Massey (Massey, 1951, p. 70). Once this is done, the program is modified as shown in Fig. 7. The three logic blocks in Fig. 7 fit between blocks A and B of Fig. 5. With this modification, the program will generate critical values after estimating the parameters of the normal by $\bar{x}$ and S and standardizing the data. When this is done with 5000 samples, the results can be compared with those of Lilliefors (Lilliefors, 1967, p. 400).

   A. Generate ordered random deviates from Normal (0,1)
   B. Find CDF value for each data point
   C. Calculate K-S statistic
   D. Store statistic in a vector array of length n
   E. Reiterate the above flow n times

Fig 5. Subprogram for Calculating Statistics

   A. Order the array of statistics
   B. Extrapolate for the 0th and n+1st order statistics
   C. Find critical value using bootstrap

Fig 6. Subprogram for Finding Critical Values

   Block A of Figure 5
   A-1. Calculate $\bar{x}$ and S for sample
   A-2. Calculate $z_i = (x_i - \bar{x})/S$ for each data point ($x_i$)
   A-3. Replace original data point ($x_i$) with $z_i$
   Block B of Figure 5

Fig 7. Program Logic for Standardizing the Data

The Number of Samples to Use. The next research issue to investigate is whether the bootstrap method will allow the use of considerably fewer than 5000 order statistics to calculate the critical values.

To do this test, critical values are calculated for the K-S statistic using 150, 300, 500, 1000, and 5000 samples. All samples are generated by Monte Carlo simulation using different seeds. If the values are essentially the same at 500, 1000, and 5000 samples, then critical values for tables can be calculated using only 500 samples. Similarly, if the values are the same for 300, 500, 1000, and 5000, then 300 samples would be enough. The point is that if the researcher wants to use 300 samples to generate tables of critical values, the critical values at 300 must be the same as those calculated using 500, 1000, and 5000. Five thousand is the number of samples commonly used in the literature to generate tables. The hope is that fewer will be needed by using the bootstrap technique. Whatever number of samples is used, however, must give results consistent with those at 5000 samples to be acceptable.

In addition to this vertical comparison, cross comparisons with critical values found using different initial seeds to the random number generator are necessary. In one vertical comparison, the values might be essentially the same at 500, 1000, and 5000 samples. However, using a different seed, this may not hold true. The only consistency might be at 1000 and 5000. The critical values must be consistent for a given number of samples--no matter what seed is used--if that number of samples is going to be used to construct valid, accurate tables.

The program used to test this issue is included in Appendix D.
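The vertical comparison (across numbers of samples) and the cross comparison (across seeds) can be sketched in Python as below. This is only an illustration of the comparison, not the Appendix D program: it uses the standard library generator, S with divisor n - 1, and a plain order-statistic critical value, and the seeds and function names are arbitrary choices.

```python
import math
import random


def ks_estimated(sample):
    """K-S distance from N(0,1) after standardizing by the sample mean and S."""
    n = len(sample)
    mean = sum(sample) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
    d = 0.0
    for i, x in enumerate(sorted(sample), start=1):
        f = 0.5 * (1.0 + math.erf((x - mean) / (s * math.sqrt(2.0))))
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def mc_critical_value(n, n_samples, alpha, seed):
    rng = random.Random(seed)
    stats = sorted(
        ks_estimated([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(n_samples)
    )
    k = min(n_samples - 1, math.ceil((1.0 - alpha) * n_samples) - 1)
    return stats[k]


def consistency_table(n, alpha, sample_counts, seeds):
    # One row per number of samples, one critical value per seed.
    return {
        count: [mc_critical_value(n, count, alpha, seed) for seed in seeds]
        for count in sample_counts
    }


def spread(values):
    return max(values) - min(values)


table = consistency_table(10, 0.10, [150, 300, 500, 1000, 5000], seeds=[11, 23, 37])
for count, values in table.items():
    print(count, [round(v, 4) for v in values], "spread", round(spread(values), 4))
```

A number of samples is acceptable in the sense described above only if its row agrees with the 5000-sample row for every seed.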

Program Subunits (Author's). Several subroutines have been written by the author for use in the various programs. The code for these subroutines is included in Appendix C. The purposes and names of these subroutines are discussed in the following paragraphs.

Three subroutines are used in the calculation of the K-S statistics. These are CVALS, LILDIF, and DSTAT. ANDAR is used to calculate the Anderson-Darling statistics, while CVM is used to calculate the Cramer-von Mises statistics.

In addition to the five above, four subroutines are used in a variety of programs. Their names and uses are listed below:

ESTPAR - Takes an input vector array of data points ($x_i$) and calculates $\bar{x}$ and S. It then standardizes the data via the transformation,

         $$z_i = \frac{x_i - \bar{x}}{S}$$

         and outputs a vector array of standardized data points ($z_i$).

DUBSAM - Takes an input vector array of length n, calculates the mean of the vector elements, reflects vector elements about the mean, and generates an output array of length 2n, which includes the original array elements plus their reflections.

XPOLAT - Used as part of the bootstrap technique. Input is a vector array of n order statistics. It extrapolates for $X_{(0)}$ and $X_{(n+1)}$. Output is an array of length n + 2.

CVALUE - Input is an array of order statistics. Output is a set of critical values based on the elements of that array.

Program Subunits (IMSL). In addition to the author's own subroutines, several subroutines from the International Mathematical and Statistical Libraries (IMSL) are used. These include the following:

GGNO - Generates an array of ordered N(0,1) random deviates.

MDNOR - For an input data point, outputs the CDF value of the standard normal distribution.

VSRTA - Orders the elements of an input array from smallest to largest.

Generation of Tables of Critical Values

Once the number of samples needed to get accurate critical values has been determined, the next step in the research is the generation of critical value tables. The tables to be generated are for the Kolmogorov-Smirnov (K-S), Anderson-Darling (A-D), and Cramer-von Mises (CVM) statistics for sample sizes n = 3 through n = 60. The critical values are for modified statistics--statistics calculated after sample data points are reflected about the sample mean.

At this point the researcher is faced with a choice. The choice is between using a complex program that produces an entire table of critical values or using a simple program and reiterating it for each sample size. The second option is chosen despite the fact that it forces manual construction of the tables. This disadvantage is outweighed by the much more rapid computer turnaround for the simple program.

As a result of using the simpler methodology, each final table requires the submission of 171 programs--57 for K-S, 57 for A-D, and 57 for CVM. These individual programs are similar to the one described in the previous sections of this chapter. The only change is that in these programs, the samples are doubled by reflection about the sample means. This is done by subroutine DUBSAM after generating the random deviates and before standardizing the data. This step occurs between logic block A of Fig. 5 and logic block A-1 of Fig. 7. The program will generate critical values for α = .20, .15, .10, .05, and .01 for a given value of n.

In addition to the above programs, twelve more are required to generate critical values for use in the power study. Since the powers are to be compared at n = 10, 25, 40, and 60, critical values at α = .20, .15, .10, .05, and .01 must be determined without reflecting the sample. This is done for each of the three statistics at each of the above four values of n.

Power  Study 

The purpose of the power study is to test the research hypothesis that the technique of reflecting data points about their means will result in goodness of fit tests with higher powers than ones which do not use that technique.

The power study is done at n = 10, n = 25, n = 40, and n = 60. The reasons for using these sample sizes are that 1) power comparisons will be available for both small and large sample sizes, and 2) trends in the behavior of the statistics' critical values can be observed.

The logic of the power study program follows. First, a sample is drawn from some distribution other than the normal. Second, the test statistic is calculated. Third, a comparison is made between the test statistic and the critical value for each level of α. If the test statistic is greater than the critical value, normality is rejected. The first three steps are then reiterated 5000 times. Each rejection is counted. The power at each α-level is computed by dividing the number of rejections by 5000. The results are then printed out.
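The power calculation can be sketched in Python for the K-S statistic alone, with and without reflection. Several choices here are assumptions made for illustration: the critical values are estimated in the same run from N(0,1) samples as plain order statistics, the doubled sample of 2n points is standardized by its own mean and S (divisor 2n - 1), and the standard library generator replaces the IMSL routines. The function names are hypothetical.

```python
import math
import random


def ks_normal(sample):
    """K-S distance from N(0,1) after standardizing by the mean and S of sample."""
    n = len(sample)
    mean = sum(sample) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
    d = 0.0
    for i, x in enumerate(sorted(sample), start=1):
        f = 0.5 * (1.0 + math.erf((x - mean) / (s * math.sqrt(2.0))))
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def reflect(sample):
    mean = sum(sample) / len(sample)
    return list(sample) + [2.0 * mean - x for x in sample]


def statistic(sample, reflected):
    return ks_normal(reflect(sample) if reflected else sample)


def critical_value(n, reflected, alpha, reps, rng):
    stats = sorted(
        statistic([rng.gauss(0.0, 1.0) for _ in range(n)], reflected)
        for _ in range(reps)
    )
    return stats[min(reps - 1, math.ceil((1.0 - alpha) * reps) - 1)]


def power(n, reflected, alpha, draw, reps, rng):
    """Fraction of samples from draw for which normality is rejected."""
    crit = critical_value(n, reflected, alpha, reps, rng)
    rejections = sum(
        statistic([draw(rng) for _ in range(n)], reflected) > crit
        for _ in range(reps)
    )
    return rejections / reps


rng = random.Random(2024)
exponential = lambda r: r.expovariate(1.0)
for reflected in (False, True):
    label = "K-S reflected" if reflected else "K-S"
    print(label, power(20, reflected, 0.05, exponential, 2000, rng))
```

Passing a normal generator as draw gives a quick validity check, since the rejection rate should then fall near the chosen α.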

Six statistics are calculated for each value of n. These statistics are the following:

1. K-S

2. K-S reflected

3. A-D

4. A-D reflected

5. CVM

6. CVM reflected

As with the generation of tables, the choice is made here to submit simple programs and then construct tables manually. Thus, to find the power of the statistics for the normal against some other distribution, four programs are required--one for each value of n. So, if seeking the power against five distributions, twenty programs are required. Different seeds are used for each run.

Flow of Typical Program. Figure 8 is a display of the logic of the typical program used in the power study. The flow in Fig. 8 is for finding the power of each of the six statistics against the exponential distribution at sample size, n = 10. The code for this particular program is included in Appendix F as an example of the FORTRAN code used.

The Distributions Used. The distributions used in this power study are the exponential, Cauchy, chi-squared with four degrees of freedom, the chi-squared with one degree of freedom, and the double exponential. The exponential random deviates are generated by the IMSL subroutine, GGEXN. The Cauchy deviates are generated by GGCAY (IMSL), and the chi-squared ones are generated by GGCHS (IMSL).

   A. Generate sample of 10 exponential random deviates
   B. Calculate the six statistics
   C. Compare with their corresponding critical values (reject if test statistic > critical value)
   D. Count rejections at each α-level
   E. Reiterate steps A through D 5000 times
   F. Calculate the powers at each α-level
   G. Print the number of rejections and the powers

Fig 8. Flow for Typical Power Study Program

The IMSL does not include a subroutine for the double exponential. Therefore, double exponential deviates are generated using the following technique. Continuous uniform random deviates, $U_i$, are generated by GGUBS (IMSL). The CDF of the double exponential [$F(y_i)$] is as follows:

$$F(y_i) = \begin{cases} \tfrac{1}{2} e^{y_i}, & y_i \leq 0 \\ 1 - \tfrac{1}{2} e^{-y_i}, & y_i > 0 \end{cases}$$

Therefore, if $U_i \leq 0.5$, then $y_i = \ln(2U_i)$, and if $U_i > 0.5$, then $y_i = -\ln(2 - 2U_i)$.

Thus, $y_i$, i = 1, 2, ..., n, is a pseudo-random sample from the double exponential distribution (Littel, McClave, and Offen, 1979, p. 265).
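The inverse-CDF technique for the double exponential can be sketched in Python as follows. The standard library uniform generator stands in for GGUBS, and the function names are illustrative.

```python
import math
import random


def double_exponential_from_uniform(u):
    """Map a uniform deviate u in (0, 1) to a double exponential deviate."""
    if not 0.0 < u < 1.0:
        raise ValueError("u must lie strictly between 0 and 1")
    if u <= 0.5:
        return math.log(2.0 * u)
    return -math.log(2.0 - 2.0 * u)


def double_exponential_sample(n, rng):
    sample = []
    while len(sample) < n:
        u = rng.random()          # uniform on [0, 1)
        if u > 0.0:               # skip the rare exact zero, where ln is undefined
            sample.append(double_exponential_from_uniform(u))
    return sample


rng = random.Random(5)
print(double_exponential_sample(5, rng))
```

Uniform deviates below one half map to the negative tail and those above one half map to the positive tail, so the generated sample is symmetric about zero in distribution.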


Programs  in  the  Appendices 

An example of each type of program described in this chapter is included as an appendix. The following is a list of the appendices and the type of program or information included in each:

Appendix A:  COMPAR - the program for comparing plotting positions.

Appendix B:  Results of COMPAR - the results of program, COMPAR, when 150 points are to be plotted.

Appendix C:  Subroutines - the computer code for the subroutines written by the author.

Appendix D:  COMLIL - the program used to validate the logic used in finding critical values for the Kolmogorov-Smirnov statistic. This program is used to determine the number of samples to use for the bootstrap technique.

Appendix E:  TABLE2 - the program for finding critical values of the modified Anderson-Darling statistic.

Appendix F:  POWERS - the program for finding the critical values of the six statistics at sample size, n = 10, when the Cauchy is the alternative distribution.

All programs are written in FORTRAN V and are run on the Control Data Systems CDC 6600 computer which is operated by the Aeronautical Systems Division at Wright-Patterson AFB, Ohio.

Summary 

This chapter is a presentation of the basic methodology used in the research. Flow diagrams are used to portray typical logic used in the different computer programs. The presentation includes discussions of 1) how plotting positions are compared, 2) how statistics and critical values are calculated, 3) how the tables of critical values are generated, and 4) how the power study was done.

The next chapter is a presentation of the results of this research.

IV. Results

This chapter is a presentation of the results of the research procedures described in the previous chapter. First to be discussed are the results of testing the three plotting positions. The section on plotting positions is followed by a section which reports the appropriate number of samples to use when finding the critical values. This is followed by the two major sections of the chapter--ones in which the tables of critical values and the results of the power study are presented. The chapter ends with a brief summary.

Test  of  Plotting  Positions 

The  purpose  of  this  testing  of  the  plotting  positions 
was  to  determine  if  there  was  any  noticeable  difference  among 
the  three.  The  results  of  the  program  using  n  =  150  (where 
n  is  the  number  of  points  to  be  plotted)  are  included  in 
Appendix  B. 

With  n  =  150,  the  average  of  the  mean  and  mode  ranks 
(AMM)  is  essentially  the  same  as  the  modified  step  rank  (MS). 
The  largest  difference  at  n  =  150  is  2.0  x  10’^.  At  n  =  300, 
the  maximum  difference  is  1  x  10'^. 

In contrast, the differences between the median rank (M) and the other two are larger (by a factor of 10) at n = 150. The largest difference between AMM and M is $1.34 \times 10^{-3}$. The largest difference between MS and M is $1.32 \times 10^{-3}$. The difference is halved when n = 300.

Although the median rank differs from the other two plotting positions, the difference is still quite small, and it shrinks further as the number of points to plot increases. Because the differences become slight as n increases, the decision was made to use the modified step rank in all calculations of critical values.

Test  of  the  Program 

As a test, the program for generating critical values was run with 5000 samples of sizes n = 10, n = 20, and n = 30. As stated in Chapter III, this was done for the Kolmogorov-Smirnov statistic so that the results could be compared with tables previously published.

The program which carried the assumption of normality, with μ = 0 and σ = 1, generated critical values which were the same as Massey's (Massey, 1951). When the parameters of the normal distribution were estimated by $\bar{x}$ and S, the results were similar to those obtained by Lilliefors (Lilliefors, 1967). The program is, thus, valid.

The  Number  of  Samples  Used 

The program for testing the consistency of critical values was run four times with a different seed each time. The program generated critical values using 150, 300, 500, 1000, and 5000 samples. The only number of samples that yielded consistent results through all four programs at all levels of α was 5000. If α = .01 had not been desired for

Tables of Critical Values

Only two complete tables of critical values are presented. Table II is a list of critical values for the Kolmogorov-Smirnov statistic when the sample is reflected about the mean. Table III is the same information for the modified Anderson-Darling statistic.

Only  a  partial  table  is  presented  for  the  Cramer-von 
Mises  statistic.  Table  generation  was  stopped  because  the 
preliminary  results  of  the  power  study  were  not  promising 
for  any  of  the  statistics.  Upon  completion  of  the  power 
study,  it  was  found  that  the  modified  CVM  statistic  was 
rarely  better  than  the  modified  A-D  statistic.  The  decision 
was  made  to  not  waste  computer  resources  generating  a  table  of 
apparently  minimal  utility. 

For  the  power  study,  however,  critical  values  of  the 
Cramer-von  Mises  statistic  were  needed  for  n  =  10,  n  =  25, 
n  =  40,  and  n  =  60.  A  list  of  the  critical  values  at  these 
values  of  n  is  included  as  Table  XV. 

Use of the Tables. The following is the sequence of steps necessary to use Tables II, III, and XV.

1. Collect data (sample size = n).

2. Double the sample by reflection about the mean (as described in Chapter III).

3. Standardize the data by the following transformation:

   $$z_i = \frac{x_i - \bar{x}}{S}$$

   where

   $x_i$ = the original data point
   $z_i$ = the standardized data point
   $\bar{x}$ = the sample mean
   S = the sample standard deviation

4. Calculate the statistic (see Chapter II).

5. Enter the table at the desired α-level and appropriate value of n.

6. If the statistic is greater than the table value, reject $H_0$: the data are from a normal population.

The tables are located on subsequent pages.
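The six steps can be sketched in Python for the modified K-S statistic at n = 10, using the n = 10 row of Table II. The sketch rests on assumptions that this section does not spell out: the statistic is computed on all 2n points of the doubled sample, S is the standard deviation of the doubled sample with divisor 2n - 1, and the usual two-sided K-S distance is used. The function names and the example data are hypothetical.

```python
import math

# Critical values from the n = 10 row of Table II (modified K-S statistic).
TABLE_II_N10 = {0.20: 0.13452, 0.15: 0.14278, 0.10: 0.15309,
                0.05: 0.16858, 0.01: 0.20295}


def modified_ks(sample):
    n = len(sample)
    mean = sum(sample) / n
    # Step 2: double the sample by reflection about the mean.
    doubled = list(sample) + [2.0 * mean - x for x in sample]
    m = len(doubled)
    # Step 3: standardize; reflection leaves the mean unchanged.
    s = math.sqrt(sum((x - mean) ** 2 for x in doubled) / (m - 1))
    zs = sorted((x - mean) / s for x in doubled)
    # Step 4: K-S distance between the doubled sample and N(0,1).
    d = 0.0
    for i, z in enumerate(zs, start=1):
        f = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
        d = max(d, i / m - f, f - (i - 1) / m)
    return d


def reject_normality(sample, alpha=0.05):
    if len(sample) != 10:
        raise ValueError("this sketch only carries the n = 10 row of Table II")
    # Steps 5 and 6: enter the table and compare.
    return modified_ks(sample) > TABLE_II_N10[alpha]


data = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.9, 4.4, 5.5, 4.9]   # made-up observations
print(modified_ks(data), reject_normality(data, 0.05))
```

The table is entered at the original sample size n, even though the statistic itself is calculated from 2n points.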


Power  Study 

The power study was initially done versus five continuous distributions. A power study computer program was also run using standard normal random deviates to validate the study. The following is a list of the distributions used and their corresponding tables:

1. Exponential (Table IV)

2. Cauchy (Table V)

3. Chi-squared with one degree of freedom (Table VI)

4. Chi-squared with four degrees of freedom (Table VII)

5. Double exponential (Table VIII)

Notes About the Tables. Several things should be noted about the tables. The first note is explanatory. The column headed "calculation method" has two symbols listed. The use of a single asterisk (*) indicates that the powers in that row are for straightforward calculation of the statistic. The use of a double asterisk (**) indicates that the powers in that row are for calculation of the statistic after doubling the sample by reflection about the arithmetic mean of the original sample.

The second item of note is that when the power is greater with the reflection technique than with straightforward calculation, the power in the (**) row is underlined.

The third point is that if one peruses Tables IV through VIII, one will not find many instances in which the double-asterisked power is underlined. When it is underlined, it is for a symmetric distribution. In the case of the Cauchy (Table V), one will notice 1) that there is minimal power improvement and 2) that the improvement is with large sample sizes. Most improvement is seen with the double exponential, although still only with relatively large sample sizes (Table VIII).

More Distributions. Because the improved power appeared to be against symmetrical unimodal distributions, it was decided to do additional power studies with the logistic and Student's t with three degrees of freedom. A study was done against the uniform just to see what would

TABLE II

Critical Values of the Modified Kolmogorov-Smirnov
Statistic for the Normal Distribution
(Parameters Estimated from the Sample)

                          α-level
  n       .20       .15       .10       .05       .01

  3     .20144    .21133    .22155    .23119    .23832
  4     .19040    .20749    .22981    .25958    .29949
  5     .18600    .19626    .20955    .22566    .25164
  6     .16981    .17746    .18786    .20432    .24933
  7     .15883    .16973    .18258    .20179    .23340
  8     .14923    .15861    .16924    .18591    .21416
  9     .14279    .15057    .16106    .17807    .20894
 10     .13452    .14278    .15309    .16858    .20295
 11     .12990    .13734    .14731    .16308    .19580
 12     .12535    .13225    .14163    .15788    .18677
 13     .12046    .12708    .13649    .15045    .18021
 14     .11654    .12344    .13171    .14519    .17046
 15     .11272    .11924    .12818    .14087    .16812
 16     .10893    .11483    .12334    .13621    .16301
 17     .10721    .11287    .12029    .13293    .15758
 18     .10334    .10929    .11811    .12912    .15234
 19     .10152    .10724    .11447    .12630    .15069
 20     .09938    .10500    .11187    .12217    .14629
 21     .09732    .10321    .11096    .12284    .14452
 22     .09416    .09965    .10693    .11735    .13980
 23     .09337    .09849    .10548    .11523    .13596
 24     .09005    .09543    .10246    .11350    .13664
 25     .08818    .09328    .09931    .11045    .13294
 26     .08777    .09302    .09986    .11062    .13309
 27     .08608    .09120    .09780    .10760    .12760
 28     .08498    .08957    .09583    .10612    .12750
 29     .08254    .08753    .09404    .10381    .12427
 30     .08144    .08635    .09190    .10019    .12038
 31     .07965    .08393    .09045    .09957    .11916
 32     .07892    .08361    .08945    .09986    .11821
 33     .07734    .08170    .08806    .09756    .11564
 34     .07769    .08144    .08750    .09657    .11538
 35     .07582    .08003    .08570    .09352    .11001
 36     .07436    .07874    .08381    .09204    .10847
 37     .07389    .07808    .08367    .09170    .10814
 38     .07355    .07781    .08314    .09134    .10898
 39     .07137    .07534    .08076    .08934    .10689
 40     .07103    .07530    .08069    .08895    .10428
 41     .07001    .07409    .07905    .08811    .10513
 42     .06954    .07352    .07928    .08722    .10372
 43     .06838    .07225    .07712    .08524    .10186
 44     .06768    .07160    .07741    .08539    .10215
 45     .06721    .07137    .07680    .08435    .10097
 46     .06683    .07025    .07503    .08331    .09976
 47     .06623    .06991    .07443    .08182    .09830
 48     .06511    .06904    .07465    .08186    .09744
 49     .06374    .06781    .07295    .08038    .09496
 50     .06363    .06690    .07226    .08051    .09377
 51     .06347    .06766    .07239    .08056    .09575
 52     .06253    .06614    .07085    .07849    .09352
 53     .06205    .06534    .07006    .07751    .09103
 54     .06153    .06488    .06965    .07704    .09189
 55     .06111    .06477    .06941    .07682    .09342
 56     .06070    .06433    .06902    .07631    .09068
 57     .05938    .06299    .06739    .07477    .08903
 58     .05995    .06304    .06773    .07561    .09026
 59     .05923    .06249    .06703    .07393    .08664
 60     .05828    .06166    .06608    .07301    .08669


TABLE III

Critical Values of the Modified Anderson-Darling
Statistic for the Normal Distribution
(Parameters Estimated from the Sample)

                          α-level
  n       .20       .15       .10       .05       .01

  3     .32197    .33708    .35211    .38220    .41027
  4     .38954    .44834    .54657    .71621   1.01573
  5     .41998    .47814    .55492    .65032    .74460
  6     .40458    .44691    .50501    .62504    .95894
  7     .41369    .46504    .54431    .67666    .94119
  8     .42964    .47950    .55208    .69138    .94023
  9     .43903    .49149    .70801    .70801   1.06640
 10     .44203    .50275    .57780    .71245   1.05927
 11     .44488    .50165    .59112    .73335   1.05248
 12     .43843    .49746    .57663    .73101   1.08325
 13     .04478    .49320    .57152    .72240   1.03260
 14     .44722    .50810    .59214    .74576   1.07173
 15     .45345    .51024    .59637    .74049   1.11475
 16     .45242    .51498    .60171    .76214   1.14640
 17     .46114    .51875    .60122    .76680   1.17257
 18     .44973    .50160    .58093    .73484   1.12748
 19     .44482    .51126    .59408    .75451   1.12149
 20     .46305    .52665    .60590    .74583   1.05373
 21     .45638    .51138    .59375    .75196   1.08071
 22     .45134    .50675    .58571    .74983   1.15273
 23     .46409    .53008    .62019    .76239   1.11597
 24     .45368    .51731    .59381    .76619   1.07996
 25     .45905    .52107    .61331    .76620   1.16808
 26     .45657    .51742    .60062    .76192   1.14596
 27     .46406    .52359    .61168    .75096   1.09041
 28     .45768    .52019    .61730    .77880   1.17745
 29     .45206    .51637    .60195    .74786   1.09862
 30     .45293    .50558    .59092    .74476   1.10891
 31     .45543    .51505    .60163    .74662   1.12114
 32     .46263    .52015    .60162    .74479   1.13510
 33     .46745    .53163    .63275    .77983   1.18250
 34     .46426    .52163    .60482    .76331   1.18852
 35     .45729    .52079    .60769    .78193   1.21168
 36     .45770    .51452    .60661    .76218   1.13499
 37     .46177    .52033    .61215    .76645   1.12664
 38     .46404    .52357    .61461    .76417   1.13875
 39     .45700    .51812    .60254    .76641   1.17802
 40     .46694    .53085    .61657    .77795   1.15300
 41     .45551    .51705    .61321    .76581   1.15760
 42     .47180    .53721    .62564    .79213   1.24298
 43     .46384    .51566    .60146    .75147   1.08809
 44     .47371    .54434    .64197    .80843   1.16483
 45     .45976    .52153    .61198    .77735   1.16194
 46     .46785    .52882    .62071    .76338   1.21628
 47     .46490    .52129    .60793    .77487   1.12421
 48     .47582    .53170    .61815    .77982   1.17364
 49     .48063    .54400    .63094    .78997   1.19872
 50     .47218    .53726    .63046    .79205   1.16489
 51     .47487    .53872    .62897    .78620   1.21199
 52     .47148    .53079    .61907    .77005   1.12364
 53     .47112    .54007    .63222    .79662   1.15710
 54     .46084    .51778    .61277    .75934   1.16752
 55     .47508    .53724    .63276    .77438   1.18181
 56     .46565    .52901    .62268    .78599   1.17529
 57     .45185    .51522    .60231    .78654   1.15393
 58     .46904    .53178    .60645    .77380   1.13277
 59     .47571    .54371    .62089    .76709   1.21326

60 

.47305 

.53179 

.61611 

.78893 

1.19363 
TABLE IV

Powers for Testing H0: Population Is Normal,
When Population Is Exponential

Actual Population: Exponential
Calculation method: * = straightforward, ** = reflection

                       Powers at α-levels
 n   Method      .20     .15     .10     .05     .01

Statistic: Kolmogorov-Smirnov
10     *       .5710   .5120   .4318   .3208   .1612
10     **      .3670   .2914   .2206   .1388   .0390
25     *       .8914   .8528   .7960   .6882   .4536
25     **      .5992   .5262   .4474   .3216   .1532
40     *       .9828   .9752   .9556   .9074   .7204
40     **      .7716   .7120   .6318   .5140   .3202
60     *       .9994   .9984   .9960   .9838   .9312
60     **      .9100   .8752   .8196   .7226   .5176

Statistic: Anderson-Darling
10     *       .6688   .6120   .5282   .4120   .2356
10     **      .3782   .3104   .2396   .1668   .0616
25     *       .9668   .9550   .9328   .8854   .7244
25     **      .6656   .6036   .5152   .3932   .1928
40     *       .9980   .9962   .9928   .9840   .9444
40     **      .8428   .7924   .7206   .6016   .3730
60     *      1.0000  1.0000   .9998   .9994   .9948
60     **      .9412   .9172   .8798   .7898   .5556

Statistic: Cramer-von Mises
10     *       .6306   .5764   .4944   .3842   .1970
10     **      .3502   .2818   .2134   .1318   .0494
25     *       .9400   .9214   .8910   .8238   .6552
25     **      .5932   .5184   .4194   .2884   .1328
40     *       .9950   .9906   .9838   .9656   .8952
40     **      .7432   .6738   .5810   .4528   .2354
60     *       .9998   .9998   .9992   .9980   .9876
60     **      .8790   .8348   .7588   .6228   .3582
TABLE V

Powers for Testing H0: Population Is Normal,
When Population Is Cauchy

Actual Population: Cauchy
Calculation method: * = straightforward, ** = reflection

                       Powers at α-levels
 n   Method      .20     .15     .10     .05     .01

Statistic: Kolmogorov-Smirnov
10     *       .7306   .6998   .6532   .5884   .4660
10     **      .6442   .6010   .5476   .4732   .3312
25     *       .9558   .9452   .9298   .9000   .8385
25     **      .9532   .9432   .9328   .9066   .8372
40     *       .9918   .9888   .9862   .9766   .9498
40     **      .9934   .9922   .9898   .9838   .9682
60     *       .9994   .9990   .9986   .9970   .9926
60     **     1.0000   .9998   .9994   .9990   .9976

Statistic: Anderson-Darling
10     *       .7452   .7132   .6688   .6082   .5010
10     **      .6478   .6042   .5634   .5064   .3838
25     *       .9662   .9610   .9524   .9358   .8870
25     **      .9618   .9538   .9414   .9246   .8708
40     *       .9946   .9936   .9924   .9884   .9740
40     **      .9950   .9942   .9930   .9892   .9788
60     *      1.0000  1.0000   .9998   .9994   .9976
60     **     1.0000  1.0000  1.0000   .9998   .9990

Statistic: Cramer-von Mises
10     *       .7436   .7090   .6658   .6104   .4816
10     **      .6456   .6088   .5582   .4878   .3778
25     *       .9644   .9578   .9468   .9294   .8826
25     **      .9608   .9508   .9404   .9196   .8724
40     *       .9950   .9936   .9922   .9874   .9730
40     **      .9954   .9936   .9926   .9896   .9782
60     *      1.0000   .9998   .9998   .9992   .9980
60     **     1.0000  1.0000  1.0000   .9996   .9986



TABLE VI

Powers for Testing H0: Population Is Normal,
When Population Is χ² (1 d.f.)

Actual Population: χ² (1 d.f.)
Calculation method: * = straightforward, ** = reflection

                       Powers at α-levels
 n   Method      .20     .15     .10     .05     .01

Statistic: Kolmogorov-Smirnov
10     *       .7850   .7366   .6608   .5420   .3430
10     **      .5762   .4994   .4046   .2854   .1262
25     *       .9904   .9860   .9738   .9492   .8484
25     **      .9262   .9000   .8616   .7692   .5636
40     *      1.0000  1.0000   .9996   .9990   .9864
40     **      .9896   .9844   .9726   .9490   .8570
60     *      1.0000  1.0000  1.0000  1.0000   .9998
60     **      .9996   .9990   .9984   .9950   .9775

Statistic: Anderson-Darling
10     *       .8818   .8418   .7832   .6822   .4952
10     **      .5796   .5000   .4124   .2988   .1378
25     *       .9994   .9992   .9980   .9930   .9662
25     **      .9314   .9066   .8600   .7654   .5228
40     *      1.0000  1.0000  1.0000  1.0000   .9994
40     **      .9918   .9862   .9746   .9430   .8238
60     *      1.0000  1.0000  1.0000  1.0000  1.0000
60     **      .9996   .9994   .9986   .9956   .9650

Statistic: Cramer-von Mises
10     *       .8512   .8122   .7444   .6476   .4356
10     **      .5436   .4754   .3780   .2508   .1126
25     *       .9978   .9964   .9922   .9818   .9398
25     **      .9042   .8670   .7960   .6794   .4422
40     *      1.0000  1.0000  1.0000   .9998   .9982
40     **      .9798   .9690   .9434   .8926   .7218
60     *      1.0000  1.0000  1.0000  1.0000  1.0000
60     **      .9990   .9984   .9952   .9834   .9044


TABLE VII

Powers for Testing H0: Population Is Normal,
When Population Is χ² (4 d.f.)

Actual Population: χ² (4 d.f.)
Calculation method: * = straightforward, ** = reflection

                       Powers at α-levels
 n   Method      .20     .15     .10     .05     .01

Statistic: Kolmogorov-Smirnov
10     *       .4138   .3488   .2716   .1806   .0708
10     **      .2694   .2064   .1406   .0740   .0112
25     *       .6566   .5974   .5146   .3872   .1932
25     **      .3470   .2746   .2126   .1190   .0394
40     *       .8340   .7910   .7248   .6036   .3548
40     **      .4106   .3362   .2582   .1612   .0652
60     *       .9348   .9132   .8640   .7648   .5392
60     **      .4840   .4092   .3228   .2176   .0922

Statistic: Anderson-Darling
10     *       .4768   .4020   .3068   .2162   .0980
10     **      .2794   .2168   .1586   .0920   .0208
25     *       .7948   .7520   .6884   .5804   .3412
25     **      .4296   .3616   .2764   .1792   .0634
40     *       .9394   .9164   .8820   .8102   .6198
40     **      .5370   .4698   .3910   .2730   .1308
60     *       .9888   .9824   .9682   .9320   .8075
60     **      .6458   .5812   .5006   .3720   .1894

Statistic: Cramer-von Mises
10     *       .4398   .3696   .2870   .1976   .0784
10     **      .2630   .2060   .1438   .0774   .0160
25     *       .7402   .6830   .6122   .4928   .2852
25     **      .3698   .3014   .2194   .1308   .0494
40     *       .9086   .8758   .8244   .7288   .5248
40     **      .4200   .3536   .2758   .1830   .0770
60     *       .9752   .9622   .9416   .8946   .7506
60     **      .5152   .4384   .3506   .2330   .0922


TABLE VIII

Powers for Testing H0: Population Is Normal,
When Population Is Double Exponential

Actual Population: Double Exponential
Calculation method: * = straightforward, ** = reflection

                       Powers at α-levels
 n   Method      .20     .15     .10     .05     .01

Statistic: Kolmogorov-Smirnov
10     *       .3604   .3030   .2376   .1572   .0646
10     **      .2874   .2306   .1698   .1090   .0330
25     *       .5084   .4402   .3618   .2566   .1196
25     **      .5098   .4528   .3912   .2862   .1354
40     *       .6376   .5858   .5114   .3852   .1820
40     **      .6702   .6150   .5440   .4312   .2616
60     *       .7536   .7036   .6264   .4816   .2664
60     **      .8088   .7646   .7046   .6004   .4020

Statistic: Anderson-Darling
10     *       .3728   .3170   .2414   .1636   .0664
10     **      .2716   .2172   .1708   .1140   .0368
25     *       .5566   .5136   .4418   .3440   .1742
25     **      .5358   .4792   .4094   .3258   .1724
40     *       .6958   .6444   .5794   .4846   .2914
40     **      .7012   .6558   .5970   .4994   .3180
60     *       .8072   .7627   .6978   .5932   .3784
60     **      .8316   .8016   .7558   .6724   .4776

Statistic: Cramer-von Mises
10     *       .3696   .3130   .2416   .1592   .0546
10     **      .2770   .2252   .1708   .1104   .0428
25     *       .5456   .4844   .4206   .3106   .1608
25     **      .5364   .4838   .4098   .3208   .1802
40     *       .6896   .6408   .5688   .4544   .2646
40     **      .6878   .6476   .5890   .4986   .3210
60     *       .8014   .7626   .7060   .6124   .3964
60     **      .8390   .8018   .7494   .6678   .4726
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TABLE  IX 

Powers  for  Testing  H^:  Population  Is  Normal, 
When  Population  is  Logistic 


Actual  Population:  Logistic 


Statistic : 

Kolmogorov -Smirnov 

n 

Calculation  method 
*= straight forward 
**=ref lection 

Powers  at  a- levels 

.20  .15  .10  .05  .01 

.  2486 

.1990  .] 

.  2328 

.1752  .1 

.  2670 

.2150  .] 

.2774 

.2174  .] 

2958 

3052 

3306 

3542 


2450 

2452 

2736 

2990 


874 

.0252 

676 

.0118 

876 

.0244 

932 

.0292 

1798 

1826 

1990 

2306 


1044 

,1094 


Statistic:  Anderson-Darling 


Statistic:  Cramer-von  Mises 


.  2358 

.1892 

.1326  .( 

.  2162 

.1692 

.1164  .( 

.  2792 

.2208 

.1712  .: 

.  2858 

.2284 

.1622  .( 

.3200 

.2612 

.2002  .] 

.3104 

.  2590 

.1938  .] 

.  3552 

.3048 

.2384  .] 

.3816 

.3276 

.2582  .: 

0312 

0412 


.2498 

.1950 

.1374 

.0806 

.  0244 

.2158 

.1634 

.1126 

.0638 

.0128 

.0434 

.2970 

.  2562 

.  1982 

.1274 

.  2866 

.2318 

.1660 

.1058 

.0370 

.  3378 

.2750 

.2208 

.1536 

.0556 

.3264 

.2734 

.2156 

.1412 

.0600 

.3848 

.3248 

.2458 

.1652 

.0596 

.  3996 

.3460 

.  2784 

.1916 

.0860 

.0592 
.  0764 




TABLE  X 

Powers  for  Testing  H^:  Population  is  Normal, 
When  Population  is  Student's  t  (3  d.f.) 


Actual  Population:  Student's  t  (3  d.f.) 


Statistic:  Kolmogorov- Smirnov 


Calculation  method 
n  *=straightforward 
**=ref lection 


Powers 

at  a- 

level 

.20 

.15 

.  10 

.05 

.01 

.  3610 
.3028 

.3066 

.2498 

.2500 

.1836 

.1726 

.1174 

.0838 

.0378 

.  5138 
.  5232 

.4596 

.4646 

.3866 

.4066 

.3004 

.3200 

.1700 

.1750 

.6324 

.6646 

.5892 

.6124 

.5132 
.  5468 

.4140 

.4632 

.2482 
.  3288 

.6356 

.5282 

.3632 

;  .7162 

.  6354 

.4752 

Statistic:  Anderson-Darling 


.3778 
.  3026 

.3214 
.  2524 

.2612 

.1994 

.1884 
.  1356 

.0990 

.0552 

.  5734 
.5636 

.5366 

.5124 

.4798 

.4482 

.4006 

.3712 

.  2530 
.2374 

.7054 
.  7124 

.6634 

.6734 

.6148 

.6226 

.  5352 
.5446 

.3788 

.4076 

.  8202 
.  8398 

.  7868 
.  8146 

.  7376 
.7820 

.6614 

.7130 

.5040 

.5766 

Statistic:  Cramer-von  Mises 


.3622  .3122  .2524 

.3020  .2490  .1942 

.5470  .4964  .4376 

.5500  .4998  .4310 

.6866  .6418  .5832 

.6924  .6476  .5904 


.  7962 

.7618  .: 

.  8256 

.7964  .' 

.4842 

.5192 


.  3332 
.3848 


.4892 
.  5464 


TABLE  XI 


Powers  for  Testing  Hq  :  Population  Is  Normal, 
When  Population  is  Uniform  (Continuous) 

Actual  Population:  Uniform  (Continuous) 


Statistic:  Kolmogorov-Smirnov 


Calculation  method 

— 

Powers 

at  a- 

level 

*= straightforward 
**=ref lection 

.20 

.15 

.10 

.05 

.01 

* 

.  2688 

.2112 

.  1420 

.0724 

.0128 

** 

.  3418 

.  2696 

.1946 

.1116 

.  0250 

* 

.  3704 

.2998 

.2156 

.1172 

.0294 

** 

.  5718 

.  4900 

.4078 

.  2596 

.  0856 

* 

.  5284 

.4482 

.3424 

.1978 

** 

.7204 

.6546 

.5542 

.1886 

* 

.6800 

.6012 

.4918 

.3038 

** 

.  8790 

.  8328 

.7586 

.6162 

ggj| 

Statistic:  Anderson-Darling 

* 

.3160 

.2428 

.1596 

.0768 

.0128 

** 

.  3584 

.  2828 

.2226 

.1386 

.0250 

* 

.  5570 

.4890 

.3866 

.2500 

.  0690 

** 

.6814 

.6122 

.5190 

.3746 

.  1384 

* 

.  7572 

.6874 

.5910 

.4414 

.1780 

** 

.8636 

.8182 

.7506 

.6158 

.3442 

* 

.9178 

.8816 

.8038 

.6670 

.3310 

** 

.9708 

\ 

.9218 

.8478 

.6038 

Statistic:  Cramer-von 

Mises 

\ 

* 

.2194 

.1450 

.0690 

.0102 

** 

.3338 

.  2798 

.2098 

.  1216 

.0322 

* 

.4724 

.  3856 

.2890 

.1736 

.0476 

** 

.6244 

.  5530 

.4528 

.3182 

.1250 

* 

.6602 

.  5806 

.4754 

.3120 

.1040 

** 

.  7842 

.  7226 

.6434 

.5120 

.  2602 

* 

.8194 

.7732 

.6950 

.5416 

.2366 

** 

.9262 

.8960 

.  8454 

.7334 

.4588 

TABLE XII

Critical Values Used in the Power Study for
the Unmodified Kolmogorov-Smirnov Statistic

                          α-level
 n       .20       .15       .10       .05       .01

10    .21595    .22547    .23857    .25841    .29564
25    .14388    .15070    .15990    .17370    .19991
40    .11442    .11937    .12631    .13792    .16200
60    .09443    .09871    .10489    .11506    .13275

TABLE XIII

Critical Values Used in the Power Study for
the Unmodified Anderson-Darling Statistic

                          α-level
 n       .20       .15       .10       .05       .01

10    .46452    .51170    .58377    .68950    .90866
25    .49224    .53019    .59532    .70333    .98629
40    .50112    .55038    .61634    .72494    .99653
60    .50662    .55866    .63651    .76620   1.06946

TABLE XIV

Critical Values Used in the Power Study for
the Unmodified Cramer-von Mises Statistic

                          α-level
 n       .20       .15       .10       .05       .01

10    .07821    .08720    .10042    .12058    .17031
25    .08110    .09046    .10300    .12522    .17525
40    .08059    .08940    .10275    .12604    .17781
60    .08221    .09052    .10270    .12414    .17726

TABLE XV

Critical Values Used in the Power Study for
the Modified Cramer-von Mises Statistic

                          α-level
 n       .20       .15       .10       .05       .01

10    .07142    .08137    .09636    .12401    .18158
25    .07181    .08269    .09921    .12744    .19170
40    .07477    .08630    .10250    .12890    .19513
60    .07370    .08433    .10076    .13065    .20545

happen. 

The results using these three additional distributions
are included in Tables IX, X, and XI, respectively.

Critical Values Used. The critical values used in
the power study for the modified K-S and A-D statistics are
the ones in Tables II and III at n = 10, 25, 40, and 60.
The critical values for the CVM statistic modified by
reflection are in Table XV.

The critical values used for the straightforward
calculation of the statistics are in Tables XII, XIII, and
XIV. Tables XII through XV were all generated using 5000
samples. This last set of tables is included for
informational purposes only. The author does not claim that
interpolation can be done for sample sizes not shown.

Summary 

This chapter is essentially a collection of tables
with explanatory comments. The tables display the important
results  of  this  research  effort.  The  next  chapter  is  a  short 
discussion  of  the  conclusions  to  be  drawn  from  these  results 
and  of  any  implications  for  further  research. 


V.  Conclusions  and  Recommendations 

This chapter is a presentation of the author's
conclusions concerning his research and his recommendations for
further research with the modified statistics. First, a
review of Schuster's (1973; 1975) ideas which apply here
will be presented along with a restatement of the general
research hypothesis. Second, conclusions about how the
actual results compare with the hypothesized results are
presented. In the same section, conclusions are stated
concerning the "best" plotting position and the "best" number
of samples to use for the bootstrap technique of determining
critical values.

Review 

The purpose of this research has been to test the
technique of reflecting data points about the arithmetic
mean before calculating previously developed goodness of fit
test statistics. This concept was motivated by work done by
Schuster (1973; 1975). The idea that samples can be reflected
about the mean is his. He used the concept to develop a
different statistic from the ones which are presented and
studied in this paper. Schuster, however, predicted that the
reflection concept would be helpful when testing within the
set of symmetrical distributions (Schuster, 1973). He also
showed that when the parameters are unknown and when testing
within the set of symmetrical distributions, the statistic he
developed would be asymptotically better than statistics
calculated without incorporating some kind of reflection
technique (Schuster, 1975). Schuster further demonstrated
that his statistic would not show improvement when testing
a symmetrical versus a non-symmetrical distribution
(Schuster, 1973).
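
The reflection step reviewed above can be sketched in a few lines. This is only an illustrative Python sketch, not the thesis code (the author's subroutines are written in FORTRAN); the function name reflect_about_mean is hypothetical.

```python
from statistics import fmean


def reflect_about_mean(sample):
    """Double a sample by mirroring each point about the sample mean.

    Each x is paired with its reflection 2*xbar - x, so the doubled
    sample has 2n points and is symmetric about xbar.
    """
    original = list(sample)
    xbar = fmean(original)
    return original + [2.0 * xbar - x for x in original]


if __name__ == "__main__":
    # The mean of 1, 2, 6 is 3, so the reflected points are 5, 4, 0.
    print(reflect_about_mean([1.0, 2.0, 6.0]))
```

Because every point gains a mirror image, the doubled sample keeps the original mean and is exactly symmetric, which is why the technique is expected to help only against symmetrical alternatives.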

Since the statistics studied here are also based upon
the same type of reflection, it was expected that using the
normal as the hypothesized distribution, 1) improved power
would be evident when deviates from other symmetrical
distributions were tested, 2) when improved power was evident, it
would be more evident as sample size increased (i.e.,
asymptotically better), and 3) no improvement would be
evident in powers generated against the non-symmetric
distributions.

The  general  hypothesis  used  to  guide  the  research 
was  stated  in  Chapter  I: 

For the normal distribution, the K-S, A-D, and CVM
statistics, modified by calculation after doubling
the sample by reflecting data points about the sample
mean, provide more powerful tests of goodness of fit
than do the same statistics calculated without
reflection.
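
To make the hypothesis concrete, the sketch below computes the three statistics with and without reflection. It is a hedged Python illustration rather than the author's FORTRAN: the function names are hypothetical, the formulas follow the standard computing forms for the K-S, A-D, and CVM statistics, and it is an assumption here that the doubled sample is standardized with its own mean and standard deviation.

```python
import math
from statistics import fmean, stdev


def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def edf_statistics(sample, reflect=False):
    """Return (K-S, A-D, CVM) for H0: normal, parameters estimated.

    With reflect=True the sample is first doubled by reflecting each
    point about the sample mean (assumed handling, see lead-in).
    """
    original = list(sample)
    xs = list(original)
    if reflect:
        xbar = fmean(original)
        xs += [2.0 * xbar - x for x in original]
    m = len(xs)
    xbar, s = fmean(xs), stdev(xs)
    u = sorted(normal_cdf((x - xbar) / s) for x in xs)

    # Kolmogorov-Smirnov: largest gap between fitted CDF and EDF steps.
    ks = max(max((j + 1) / m - u[j], u[j] - j / m) for j in range(m))

    # Anderson-Darling; values at 0 or 1 are nudged inward so logs exist.
    eps = 1e-4
    total = 0.0
    for j in range(m):
        uj = min(max(u[j], eps), 1.0 - eps)
        uk = min(max(u[m - 1 - j], eps), 1.0 - eps)
        total += (2 * j + 1) * (math.log(uj) + math.log(1.0 - uk))
    ad = -m - total / m

    # Cramer-von Mises.
    cvm = 1.0 / (12.0 * m) + sum(
        (u[j] - (2 * j + 1) / (2.0 * m)) ** 2 for j in range(m)
    )
    return ks, ad, cvm


if __name__ == "__main__":
    data = [0.3, -1.2, 0.8, 1.9, -0.4, 0.1, -2.1, 1.1, 0.6, -0.7]
    print("straightforward:", edf_statistics(data))
    print("reflection:     ", edf_statistics(data, reflect=True))
```

The modified statistics are compared against their own critical values (Tables II, III, and XV), not against the critical values of the unmodified statistics.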

Conclusions 

Primary Research. Although the three new statistics
tested in this thesis are not identical to Schuster's, the
predictions made based upon his work are valid. The powers
calculated for symmetrical alternatives to the normal are
asymptotically greater for the three modified statistics
than for the corresponding unmodified statistics. Also, the
powers for the three new statistics, when calculated for
non-symmetrical alternatives to the normal, are lower than
for their unmodified counterparts. This can be seen in the
power study tables of Chapter IV.

The general research hypothesis is only partially
valid. The modified statistics are not universally of
higher power than their unmodified counterparts. Higher
powers are evident only for larger sample sizes (n ≥ 25 in
some instances, n ≥ 40 in most instances) when continuous
symmetrical alternatives are tested. The only alternative
distribution for which the modified statistics display
higher power for all sample sizes is the continuous uniform.
Thus, the research hypothesis is false with (continuous)
non-symmetrical alternative distributions, partially true
for (continuous) symmetrical alternatives, and true when the
alternative distribution tested is the (continuous) uniform.
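
A power comparison of this kind can be sketched by simulation. The Python below is an illustration under stated assumptions, not the author's program: it uses the n = 10, α = .05 critical values printed in Tables XIV and XV for the Cramer-von Mises statistic, assumes the doubled sample is standardized with its own mean and standard deviation, and uses hypothetical function names. Estimated rates will vary with the seed and the number of replications.

```python
import math
import random
from statistics import fmean, stdev

# Critical values at n = 10, alpha = .05, copied from Tables XIV and XV.
CRIT_UNMODIFIED = 0.12058
CRIT_MODIFIED = 0.12401


def cvm_statistic(sample, reflect=False):
    """Cramer-von Mises statistic for H0: normal, parameters estimated."""
    original = list(sample)
    xs = list(original)
    if reflect:
        xbar = fmean(original)
        xs += [2.0 * xbar - x for x in original]
    m = len(xs)
    xbar, s = fmean(xs), stdev(xs)
    u = sorted(
        0.5 * (1.0 + math.erf((x - xbar) / (s * math.sqrt(2.0)))) for x in xs
    )
    return 1.0 / (12.0 * m) + sum(
        (u[j] - (2 * j + 1) / (2.0 * m)) ** 2 for j in range(m)
    )


def rejection_rate(draw, crit, reflect, n=10, reps=2000, seed=1):
    """Fraction of simulated samples whose statistic exceeds crit."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [draw(rng) for _ in range(n)]
        if cvm_statistic(sample, reflect) > crit:
            hits += 1
    return hits / reps


if __name__ == "__main__":
    alternatives = {
        "exponential": lambda rng: rng.expovariate(1.0),
        "uniform": lambda rng: rng.random(),
    }
    for name, draw in alternatives.items():
        plain = rejection_rate(draw, CRIT_UNMODIFIED, reflect=False)
        modified = rejection_rate(draw, CRIT_MODIFIED, reflect=True)
        print(f"{name}: straightforward {plain:.3f}, reflection {modified:.3f}")
```

Passing a normal generator instead of an alternative gives a rough check that the rejection rate under H0 stays near the nominal α.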

The problem implied by these conclusions is that
the applicability of the statistical tables generated is
limited. It is the author's conclusion that the tables are
useful when it has already been determined (or is highly
suspected) that the population from which the sample is drawn
is distributed symmetrically. Even with symmetrical
distributions, the tables are only useful for larger sample
sizes. The only distributions for which the power with the
modified statistics is substantially greater are the double
exponential, Student's t with three degrees of freedom, and
the uniform.

Another thing the analyst should consider before
using these new statistics is whether the significant losses
of power against non-symmetrical distributions are worth
trading for the much smaller increases in power against the
symmetrical distributions. It must be remembered that HA
(the alternative hypothesis) is that the sample is not from
a normal population. If he has no knowledge of the
population from which the sample is drawn, the analyst could
sacrifice substantial power by using these modified statistics.

Finally, the power study tables have been integrated
into Table XVI. The statistic which had the highest power,
for a given sample size and α-level, has been listed
opposite the alternative distribution for which the power was
calculated. For instance, for the logistic distribution at
α = .20 and n = 40, the most powerful statistic of the six
is the Anderson-Darling, calculated without reflecting the
sample. Throughout Table XVI, an "S" in parentheses indicates
straightforward (unmodified) calculation of the statistic.
An "R" in parentheses indicates calculation of the statistic
after reflection. The non-symmetrical distributions tested
are not included in the table because, for all sample sizes
and all α-levels, the unmodified Anderson-Darling statistic
is the most powerful for these distributions.

TABLE XVI

The Statistics with Highest Power When Critical
Values for the Normal Are Tested Using Various
Symmetrical Alternative Distributions

Distribution                           α-level
Tested           n      .20      .15      .10      .05      .01

Uniform         10    A-D(R)   A-D(R)   A-D(R)   A-D(R)   A-D(R)
                25      ″        ″        ″        ″        ″
                40      ″        ″        ″        ″        ″
                60      ″        ″        ″        ″        ″

Logistic        10    A-D(S)   A-D(S)   A-D(S)   A-D(S)   A-D(S)
                25      ″        ″        ″        ″        ″
                40      ″        ″        ″        ″        ″
                60    A-D(R)   A-D(R)   A-D(R)   A-D(R)   A-D(R)

Student's t     10    A-D(S)   A-D(S)   A-D(S)   A-D(S)   A-D(S)
(3 d.f.)        25      ″        ″        ″        ″        ″
                40    A-D(R)   A-D(R)   A-D(R)   A-D(R)   A-D(R)
                60      ″        ″        ″        ″        ″

Cauchy          10    A-D(S)   A-D(S)   A-D(S)   A-D(S)   CVM(S)
                25      ″        ″        ″        ″      A-D(S)
                40    CVM(R)   A-D(R)   A-D(R)   A-D(R)   A-D(R)
                60    A-D(R)     ″        ″        ″        ″

Double          10    A-D(S)   A-D(S)   CVM(S)   A-D(S)   A-D(S)
Exponential     25      ″        ″      A-D(S)     ″      CVM(R)
                40    A-D(R)   A-D(R)   A-D(R)   A-D(R)   CVM(R)
                60    CVM(R)     ″        ″        ″      A-D(R)


It should be noted that the evident predominance of
the Anderson-Darling statistic was the basis for not
generating a critical value table for the modified Cramer-von Mises
statistic.

Ancillary Research Issues. The author's conclusions
about the other issues tested are made apparent in the
decisions discussed in Chapter IV. As far as determination of
the "best" plotting position to use with the bootstrap
technique is concerned, the conclusion is that when large numbers
of statistics are to be plotted, it makes no difference which
of the three plotting positions is used.
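
The insensitivity to plotting position for large numbers of points can be illustrated numerically. The sketch below is Python, not the author's COMPAR program, and the three formulas are assumptions: common textbook forms of the median rank (Benard's approximation), the step-midpoint position (i - 0.5)/n, and the average of the mean rank i/(n + 1) and mode rank (i - 1)/(n - 1). COMPAR's exact definitions are not reproduced in this chapter.

```python
def median_rank(i, n):
    """Benard's approximation to the median rank (assumed form)."""
    return (i - 0.3) / (n + 0.4)


def modified_step_rank(i, n):
    """Midpoint of the i-th step of the empirical CDF (assumed form)."""
    return (i - 0.5) / n


def mean_mode_average(i, n):
    """Average of mean rank i/(n+1) and mode rank (i-1)/(n-1); needs n >= 2."""
    return 0.5 * (i / (n + 1) + (i - 1) / (n - 1))


def max_spread(n):
    """Largest disagreement among the three positions over i = 1..n."""
    worst = 0.0
    for i in range(1, n + 1):
        p = (
            median_rank(i, n),
            modified_step_rank(i, n),
            mean_mode_average(i, n),
        )
        worst = max(worst, max(p) - min(p))
    return worst


if __name__ == "__main__":
    for n in (150, 300, 500, 1000, 5000):
        print(n, max_spread(n))
```

Under these assumed formulas the largest disagreement shrinks roughly in proportion to 1/n, which is consistent with the conclusion that the choice matters little once many statistics are plotted.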

The  conclusion  that  5000  (versus  150,  300,  500,  and 
1000)  samples  was  the  number  of  samples  to  use  to  generate 
critical  values  is  sufficiently  explained  in  Chapter  IV. 
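
The way critical values are read off a large set of simulated statistics can be sketched as follows. This is a hedged Python illustration, not the author's FORTRAN: it simulates the unmodified K-S statistic with estimated parameters, assigns the ordered statistics the plotting positions (i - 0.5)/m, and interpolates linearly at 1 - α. All function names are hypothetical.

```python
import math
import random
from statistics import fmean, stdev


def ks_statistic(sample):
    """K-S statistic for H0: normal, mean and variance estimated."""
    n = len(sample)
    xbar, s = fmean(sample), stdev(sample)
    u = sorted(
        0.5 * (1.0 + math.erf((x - xbar) / (s * math.sqrt(2.0))))
        for x in sample
    )
    return max(max((i + 1) / n - u[i], u[i] - i / n) for i in range(n))


def interpolated_quantile(sorted_values, p):
    """Linear interpolation at p using plotting positions (i - 0.5)/m."""
    m = len(sorted_values)
    pos = p * m - 0.5  # fractional 0-based index whose position equals p
    lo = max(0, min(int(math.floor(pos)), m - 2))
    frac = pos - lo
    return sorted_values[lo] + frac * (sorted_values[lo + 1] - sorted_values[lo])


def critical_value(n, alpha, reps=5000, seed=1):
    """Estimate the upper-alpha critical value from reps N(0,1) samples."""
    rng = random.Random(seed)
    stats = sorted(
        ks_statistic([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(reps)
    )
    return interpolated_quantile(stats, 1.0 - alpha)


if __name__ == "__main__":
    for reps in (150, 300, 500, 1000, 5000):
        print(reps, round(critical_value(10, 0.01, reps=reps), 5))
```

Rerunning with different seeds shows how much the α = .01 estimate moves when few samples are used, which is the consideration behind the choice of 5000.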

Recommendations  for  Further  Research 

The power study done for this thesis is extensive
and the conclusions, thus, are based on rather thorough
research. The author sees no apparent reason to make further
studies of this new technique with the normal distribution.

However, the results of the power study when the
continuous uniform distribution is used are interesting. The
power increase that results is quite substantial. The powers
demonstrated are better than for any of the statistics tested
by Green and Hegazy (1976). Perhaps, if the technique of
reflection is applied to the same statistics to generate
critical values for the continuous uniform, the resultant
powers for the uniform might be very high. This might, at
least, be the case when samples from symmetrical
distributions are tested.

The only other suggestion concerns the number of
samples to use with the bootstrap technique. The decision
to use 5000 samples rather than investigate alternative
numbers between 1000 and 5000 samples was one of expedience.
Before the bootstrap technique is again used to find critical
values, numbers of samples greater than 1000 and less than
5000 should be examined for consistency at the α = .01 level
of significance. Some savings of computer resources may still
be possible.

APPENDIX A


COMPAR 

The computer code for comparing the median rank, the
modified step rank, and the average of the mean and mode
ranks as plotting positions is included in the following
three pages.

APPENDIX B


Results  of  COMPAR 

The results of program COMPAR are included in the
following six pages. These particular results are for when
there are 150 points to be plotted.

Subroutines


All subroutines written and used by the author are
included in this section. The purpose of each subroutine
is discussed in Chapter III.


********************************************************** 

      SUBROUTINE CVALS(YLOWER,YUPPER,DLOWER,DUPPER,YVALUE,DOUT)
C     Linear interpolation: DOUT is the D value at plotting position
C     YVALUE on the line through (DLOWER,YLOWER) and (DUPPER,YUPPER).
      REAL M,B,YLOWER,DLOWER,YUPPER,DUPPER,YVALUE,DOUT
      M = (YUPPER - YLOWER)/(DUPPER - DLOWER)
      B = YLOWER - (M * DLOWER)
      DOUT = (YVALUE - B) / M
      END

**********************************************************

      SUBROUTINE LILDIF(N,F,DIF)
C     Differences between the fitted CDF values F(I) and the upper
C     (I/N) and lower ((I-1)/N) steps of the empirical CDF.
      REAL F(*),DIF(*)
      INTEGER I,N
      DO 100 I = 1,N
         DIF(I) = F(I) - (REAL(I)/REAL(N))
         DIF(I+N) = F(I) - ((REAL(I) - 1.0)/N)
  100 CONTINUE
      END


********************************************************** 

      SUBROUTINE DSTAT(M,DIFF,MDIF)
C     Largest absolute entry among the 2*M differences in DIFF.
      INTEGER I,M,M1
      REAL MDIF,DIFF(*)
      M1 = 2 * M
      MDIF = 0.0
      DO 100 I = 1,M1
         DIFF(I) = ABS(DIFF(I))
         IF (DIFF(I) .GE. MDIF) MDIF = DIFF(I)
  100 CONTINUE
      END

**********************************************************

      SUBROUTINE ANDAR(N,U,WSQUAR,COUNT1,COUNT2)
      INTEGER N,COUNT1,COUNT2,J
      REAL U(*),SUM,LNUJ,LNUNJ1,WSQUAR,UNJ1,UJ
      COUNT1 = 0
      COUNT2 = 0
      SUM = 0.0
      DO 100 J = 1,N
         UJ = U(J)
         IF (UJ .LE. 0.0) THEN
            UJ = .0001
            COUNT1 = COUNT1 + 1
         ENDIF
         UNJ1 = 1.0 - U(N-J+1)
         IF (UNJ1 .LE. 0.0) THEN
            UNJ1 = .0001
            COUNT2 = COUNT2 + 1
         ENDIF
         LNUJ = LOG(UJ)
         LNUNJ1 = LOG(UNJ1)
         SUM = (((2.0*REAL(J)) - 1.0) * (LNUJ+LNUNJ1)) + SUM
  100 CONTINUE
      WSQUAR = 0.0 - REAL(N) - ((1.0/REAL(N)) * SUM)
      END

**********************************************************

      SUBROUTINE CVM(N,U,WSQUAR)
      INTEGER J,N
      REAL U(*),SUM,WSQUAR,VALUE
      SUM = 0.0
      DO 100 J = 1,N
         VALUE = ((2.0*REAL(J)) - 1.0)/(2.0*REAL(N))
         SUM = SUM + ((U(J) - VALUE) * (U(J) - VALUE))
  100 CONTINUE
      WSQUAR = (1.0/(12.0*REAL(N))) + SUM
      END


********************************************************** 

      SUBROUTINE DUBSAM(X,N)
      INTEGER I,N
      REAL X(*),XBAR
      XBAR = 0.0
      DO 100 I = 1,N
         XBAR = XBAR + X(I)
  100 CONTINUE
      XBAR = XBAR/N
      DO 200 I = 1,N
         X(N+I) = (2.0 * XBAR) - X(I)
  200 CONTINUE
      END

**********************************************************

      SUBROUTINE ESTPAR(X,N)
      INTEGER I,N
      REAL X(*),XSUM,XBAR,S,NMRATR
      XSUM = 0.0
      NMRATR = 0.0
      DO 100 I = 1,N
         XSUM = XSUM + X(I)
  100 CONTINUE
      XBAR = XSUM/N
      DO 200 I = 1,N
         NMRATR = NMRATR + ((X(I)-XBAR)*(X(I)-XBAR))
  200 CONTINUE
      S = SQRT(NMRATR/(N-1))
      DO 300 I = 1,N
         X(I) = (X(I) - XBAR)/S
  300 CONTINUE
      END

********************************************************** 


      SUBROUTINE XPOLAT(N,D)
      INTEGER N,NMIN1,NPLUS1
      REAL Y1,Y2,D(0:*),DLOWER,DUPPER,XO
      Y1 = 0.5/N
      Y2 = 1.5/N
      DLOWER = D(1)
      DUPPER = D(2)
      CALL CVALS(Y1,Y2,DLOWER,DUPPER,0.0,XO)
      IF (XO .GE. 0.0) THEN
         D(0) = XO
      ELSE
         D(0) = 0.0
      ENDIF
      Y1 = (REAL(N) - 1.5)/N
      Y2 = (REAL(N) - 0.5)/N
      NMIN1 = N - 1
      DLOWER = D(NMIN1)
      DUPPER = D(N)
      CALL CVALS(Y1,Y2,DLOWER,DUPPER,1.0,XO)
      NPLUS1 = N + 1
      D(NPLUS1) = XO
      END

**********************************************************

      SUBROUTINE CVALUE(D,CVAL80,CVAL85,CVAL90,CVAL95,CVAL99,N)
C     Critical values at the .80, .85, .90, .95 and .99 points of the
C     ordered statistics D, using plotting positions (I-0.5)/N and
C     linear interpolation between the bracketing points.
      INTEGER I,N,NPLS1
      REAL D(0:*),Y(0:6000),COMP80,COMP90,COMP95,COMP99,
     +     Y79,D79,Y81,D81,DIF90,Y89,Y91,D89,D91,DIF95,DIF80,
     +     Y94,Y96,D94,D96,DIF99,Y98,Y100,D98,D100,DIF85,COMP85,
     +     Y84,D84,Y86,D86,CVAL85,CVAL80,CVAL90,CVAL95,CVAL99
      DO 100 I = 1,N
         Y(I) = (REAL(I) - 0.5)/REAL(N)
  100 CONTINUE
      Y(0) = 0.0
      NPLS1 = N + 1
      Y(NPLS1) = 1.0
      COMP80 = 1000.0
      COMP85 = 1000.0
      COMP90 = 1000.0
      COMP95 = 1000.0
      COMP99 = 1000.0
      DO 200 I = NPLS1,0,-1
         IF (Y(I) .LE. 0.75) GOTO 300
         IF (Y(I) .GT. 0.75 .AND. Y(I) .LE. 0.80) THEN
            DIF80 = .80 - Y(I)
            IF (DIF80 .LE. COMP80) THEN
               COMP80 = DIF80
               Y79 = Y(I)
               D79 = D(I)
               Y81 = Y(I+1)
               D81 = D(I+1)
            ENDIF
         ELSEIF (Y(I) .GT. .80 .AND. Y(I) .LE. .85) THEN
            DIF85 = .85 - Y(I)
            IF (DIF85 .LE. COMP85) THEN
               COMP85 = DIF85
               Y84 = Y(I)
               D84 = D(I)
               Y86 = Y(I+1)
               D86 = D(I+1)
            ENDIF
         ELSEIF (Y(I) .GT. .85 .AND. Y(I) .LE. .90) THEN
            DIF90 = .90 - Y(I)
            IF (DIF90 .LE. COMP90) THEN
               COMP90 = DIF90
               Y89 = Y(I)
               D89 = D(I)
               Y91 = Y(I+1)
               D91 = D(I+1)
            ENDIF
         ELSEIF (Y(I) .GT. .90 .AND. Y(I) .LE. .95) THEN
            DIF95 = .95 - Y(I)
            IF (DIF95 .LE. COMP95) THEN
               COMP95 = DIF95
               Y94 = Y(I)
               Y96 = Y(I+1)
               D94 = D(I)
               D96 = D(I+1)
            ENDIF
         ELSEIF (Y(I) .GT. .95 .AND. Y(I) .LE. .99) THEN
            DIF99 = .99 - Y(I)
            IF (DIF99 .LE. COMP99) THEN
               COMP99 = DIF99
               Y98 = Y(I)
               Y100 = Y(I+1)
               D98 = D(I)
               D100 = D(I+1)
            ENDIF
         ENDIF
  200 CONTINUE
  300 CONTINUE
      IF (DIF80 .EQ. 0.0) THEN
         CVAL80 = D79
      ELSE
         CALL CVALS(Y79,Y81,D79,D81,.80,CVAL80)
      ENDIF
      IF (DIF85 .EQ. 0.0) THEN
         CVAL85 = D84
      ELSE
         CALL CVALS(Y84,Y86,D84,D86,.85,CVAL85)
      ENDIF
      IF (DIF90 .EQ. 0.0) THEN
         CVAL90 = D89
      ELSE
         CALL CVALS(Y89,Y91,D89,D91,.90,CVAL90)
      ENDIF
      IF (DIF95 .EQ. 0.0) THEN
         CVAL95 = D94
      ELSE
         CALL CVALS(Y94,Y96,D94,D96,.95,CVAL95)
      ENDIF
      IF (DIF99 .EQ. 0.0) THEN
         CVAL99 = D98
      ELSE
         CALL CVALS(Y98,Y100,D98,D100,.99,CVAL99)
      ENDIF
      END


APPENDIX D


COMLIL


This program was used to determine the number of
samples to use for the bootstrap technique and to validate
the logic used to find critical values of the unmodified
Kolmogorov-Smirnov statistic.
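
The procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the thesis code: the function names `lilliefors_stat` and `critical_values` are hypothetical, Python's `random` module stands in for the IMSL generator, and the interpolation at plotting positions (i - 0.5)/reps is assumed to mirror what CVALUE and CVALS do.

```python
import random
from statistics import NormalDist, mean, stdev


def lilliefors_stat(sample):
    """K-S distance between the sample EDF and a Normal CDF whose
    mean and standard deviation are estimated from the sample."""
    n = len(sample)
    m, s = mean(sample), stdev(sample)
    z = sorted((x - m) / s for x in sample)
    cdf = NormalDist().cdf
    d = 0.0
    for i, zi in enumerate(z, start=1):
        p = cdf(zi)
        # Largest gap above and below the step of the EDF at zi.
        d = max(d, i / n - p, p - (i - 1) / n)
    return d


def critical_values(n, reps, levels=(0.80, 0.85, 0.90, 0.95, 0.99),
                    seed=21478):
    """Monte Carlo critical values for Normal samples of size n."""
    rng = random.Random(seed)
    stats = sorted(
        lilliefors_stat([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(reps)
    )
    out = {}
    for level in levels:
        # Fractional 1-based rank r such that (r - 0.5)/reps == level.
        rank = level * reps + 0.5
        lo = min(max(int(rank), 1), reps - 1)
        frac = rank - lo
        # Linear interpolation between neighbouring order statistics.
        out[level] = stats[lo - 1] + frac * (stats[lo] - stats[lo - 1])
    return out


if __name__ == "__main__":
    for reps in (500, 1000, 5000):
        print(reps, critical_values(10, reps))
```

Running the sketch for several values of `reps` shows how the estimated critical values settle as the number of simulated samples grows, which is the question COMLIL was written to answer.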


      PROGRAM COMLIL
      INTEGER SAMSIZ,J,N,K,I,SMPLS1,A,M
      REAL R(120),DIFFS(240),LILST,Y,P,DSTATS(0:5004),
     +     CV80,CV85,CV90,CV95,CV99
      DOUBLE PRECISION SEED1

      SEED1 = 21478.D0
      DO 400 A = 1,5
         IF (A .EQ. 1) SAMSIZ = 153
         IF (A .EQ. 2) SAMSIZ = 505
         IF (A .EQ. 3) SAMSIZ = 503
         IF (A .EQ. 4) SAMSIZ = 1003
         IF (A .EQ. 5) SAMSIZ = 5003
         DO 500 N = 10,30,10
            PRINT *,'N = ',N,' AND SAMSIZ = ',SAMSIZ
            DO 100 J = 1,SAMSIZ
               CALL GGNO(SEED1,1,N,N,R)
               M = N
               CALL ESTPAR(R,M)
               CALL VSRTA(R,M)
C              Replace each standardized value by its Normal CDF.
               DO 200 K = 1,M
                  Y = R(K)
                  CALL MDNOR(Y,P)
                  R(K) = P
  200          CONTINUE
               CALL LILDIF(M,R,DIFFS)
               CALL DSTAT(M,DIFFS,LILST)
               DSTATS(J) = LILST
  100       CONTINUE
            DSTATS(0) = 0.0
            SMPLS1 = SAMSIZ + 1
            CALL VSRTA(DSTATS,SMPLS1)
            CALL XPOLAT(SAMSIZ,DSTATS)
            CALL CVALUE(DSTATS,CV80,CV85,CV90,CV95,CV99,SAMSIZ)
AD-A115 496
UNCLASSIFIED

AIR FORCE INST OF TECH WRIGHT-PATTERSON AFB OH SCHOO--ETC F/G 12/1
A NEW GOODNESS OF FIT TEST FOR NORMALITY WITH MEAN AND VARIANCE--ETC(U)
DEC 81 T J REAM

AFIT/GOR/MA/81D-9 NL


            PRINT *,'FOR ',SAMSIZ,' L...'
            PRINT *,'CVAL80 = ',CV80
            PRINT *,'CVAL85 = ',CV85
            PRINT *,'CVAL90 = ',CV90
            PRINT *,'CVAL95 = ',CV95
            PRINT *,'CVAL99 = ',CV99
            PRINT *
            PRINT *
  500    CONTINUE
  400 CONTINUE
      END
APPENDIX  E 


TABLE2 


This  program  is  typical  of  the  programs  used  to 
obtain  critical  values  of  the  statistics.  This  particular 
example  is  used  to  find  critical  values  for  the  Anderson- 
Darling  statistic  at  sample  size,  n  =  40. 
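
The reflection step and the Anderson-Darling computation that TABLE2 relies on can be sketched as follows. This is a hedged Python illustration rather than the thesis implementation: `reflect_double` and `anderson_darling` are hypothetical names, the clamping constant `eps` is an assumption added to avoid taking the logarithm of zero, and the standard computing formula for the statistic is assumed to match what the ANDAR subroutine evaluates.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def reflect_double(sample):
    """Double a sample by reflecting each point about the sample mean."""
    xbar = mean(sample)
    return list(sample) + [2.0 * xbar - x for x in sample]


def anderson_darling(sample):
    """Anderson-Darling statistic against a Normal distribution whose
    mean and standard deviation are estimated from the sample."""
    n = len(sample)
    m, s = mean(sample), stdev(sample)
    cdf = NormalDist().cdf
    u = sorted(cdf((x - m) / s) for x in sample)
    eps = 1e-12  # assumed guard so that log() never sees 0
    u = [min(max(p, eps), 1.0 - eps) for p in u]
    total = sum(
        (2 * i - 1) * (math.log(u[i - 1]) + math.log(1.0 - u[n - i]))
        for i in range(1, n + 1)
    )
    return -n - total / n


if __name__ == "__main__":
    rng = random.Random(469857936)
    original = [rng.gauss(0.0, 1.0) for _ in range(40)]
    doubled = reflect_double(original)       # 80 points, same mean
    print(anderson_darling(doubled))
```

Repeating the last three lines many times and sorting the resulting statistics gives the empirical distribution from which critical values for the original sample size n = 40 are read.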

********************************************************** 


      PROGRAM TABLE2
      INTEGER SAMSIZ,J,N,K,SMPLS1,M,CNT1,CNT2,
     +        COUNT1,COUNT2
      REAL R(120),Y,P,WSQUAR(0:5004),WSQRD,
     +     CV80,CV85,CV90,CV95,CV99
      DOUBLE PRECISION SEED1

      SEED1 = 469857936.D0
      COUNT1 = 0
      COUNT2 = 0
      SAMSIZ = 5000
      N = 40
      PRINT *,'N = ',N,' AND SAMSIZ = ',SAMSIZ
      DO 100 J = 1,SAMSIZ
         CALL GGNO(SEED1,1,N,N,R)
C        Double the sample by reflection about its mean.
         CALL DUBSAM(R,N)
         M = 2 * N
         CALL ESTPAR(R,M)
         CALL VSRTA(R,M)
         DO 200 K = 1,M
            Y = R(K)
            CALL MDNOR(Y,P)
            R(K) = P
  200    CONTINUE
         CALL ANDAR(M,R,WSQRD,CNT1,CNT2)
         WSQUAR(J) = WSQRD
         COUNT1 = COUNT1 + CNT1
         COUNT2 = COUNT2 + CNT2
  100 CONTINUE

      WSQUAR(0) = 0.0
      SMPLS1 = SAMSIZ + 1
      CALL VSRTA(WSQUAR,SMPLS1)
      CALL XPOLAT(SAMSIZ,WSQUAR)
      CALL CVALUE(WSQUAR,CV80,CV85,CV90,CV95,CV99,SAMSIZ)

      PRINT *,' FOR ',SAMSIZ,
     +        ' ANDERSON-DARLING STATISTICS AT N = ',N
      PRINT *,'CVAL80 = ',CV80
      PRINT *,'CVAL85 = ',CV85
      PRINT *,'CVAL90 = ',CV90
      PRINT *,'CVAL95 = ',CV95
      PRINT *,'CVAL99 = ',CV99
      PRINT *
      PRINT *,'COUNT1 = ',COUNT1
      PRINT *,'COUNT2 = ',COUNT2
      END



APPENDIX  F 


POWERS 


This program is typical of those used in the power
study. This particular one is used to find powers for all
six statistics when tested against 5000 samples of size
n = 10 from the Cauchy distribution.
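
The power calculation amounts to counting how often a statistic exceeds its critical value when the data come from the alternative distribution. The Python sketch below illustrates this for one of the six statistics only. It is an assumption-laden illustration, not the thesis program: the names `ks_estimated_normal`, `cauchy_sample` and `estimate_power` are hypothetical, Python's `random` replaces the IMSL Cauchy generator, and the critical value .25841 is the .95 value that appears in the POWERS listing for the unmodified Kolmogorov-Smirnov statistic at n = 10.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def ks_estimated_normal(sample):
    """K-S statistic against a Normal CDF with estimated parameters."""
    n = len(sample)
    m, s = mean(sample), stdev(sample)
    z = sorted((x - m) / s for x in sample)
    cdf = NormalDist().cdf
    d = 0.0
    for i, zi in enumerate(z, start=1):
        p = cdf(zi)
        d = max(d, i / n - p, p - (i - 1) / n)
    return d


def cauchy_sample(rng, n):
    """Standard Cauchy variates by inverting the CDF."""
    return [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n)]


def estimate_power(stat, draw, critical_value, n, reps, seed=1095785):
    """Fraction of simulated samples whose statistic exceeds the
    critical value, i.e. the estimated rejection rate."""
    rng = random.Random(seed)
    rejections = sum(
        1 for _ in range(reps) if stat(draw(rng, n)) > critical_value
    )
    return rejections / reps


if __name__ == "__main__":
    print(estimate_power(ks_estimated_normal, cauchy_sample,
                         0.25841, 10, 5000))
```

Passing a Normal generator as `draw` instead of `cauchy_sample` should return a rejection rate close to the nominal .05, which is a convenient sanity check on the critical value.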

********************************************************** 

      PROGRAM POWERS
      INTEGER NR,J,K,L,M,COUNT(4),POWER(30),CNT1,CNT2,I
      REAL WK(360),R(120),S(120),T(120),DIFFS(240),
     +     Y,P,LILIES,LILIE2,ANDAR1,ANDAR2,CRVM,
     +     CRVM2,PWR(30)
      DOUBLE PRECISION SEED1

      SEED1 = 1095785.D0
      DO 600 I = 1,30
         POWER(I) = 0
  600 CONTINUE
      DO 800 I = 1,4
         COUNT(I) = 0
  800 CONTINUE

      NR = 10
      DO 100 J = 1,5000
         CALL GGCAY(SEED1,NR,WK,R)
         CALL VSRTA(R,NR)
C        S holds the original sample, T the sample to be doubled.
         DO 200 K = 1,NR
            S(K) = R(K)
            T(K) = R(K)
  200    CONTINUE
         CALL ESTPAR(S,NR)
         CALL DUBSAM(T,NR)
         M = 2 * NR
         CALL ESTPAR(T,M)
         CALL VSRTA(S,NR)
         CALL VSRTA(T,M)
         DO 300 L = 1,M
            Y = T(L)
            CALL MDNOR(Y,P)
            T(L) = P
  300    CONTINUE
         DO 400 L = 1,NR
            Y = S(L)
            CALL MDNOR(Y,P)
            S(L) = P
  400    CONTINUE

C        Statistics without reflection.
         CALL LILDIF(NR,S,DIFFS)
         CALL DSTAT(NR,DIFFS,LILIES)
         CALL ANDAR(NR,S,ANDAR1,CNT1,CNT2)
         COUNT(1) = COUNT(1) + CNT1
         COUNT(2) = COUNT(2) + CNT2
         CALL CVM(NR,S,CRVM)

C        Statistics for the doubled (reflected) sample.
         CALL LILDIF(M,T,DIFFS)
         CALL DSTAT(M,DIFFS,LILIE2)
         CALL ANDAR(M,T,ANDAR2,CNT1,CNT2)
         COUNT(3) = COUNT(3) + CNT1
         COUNT(4) = COUNT(4) + CNT2
         CALL CVM(M,T,CRVM2)

C        Count rejections at the .80, .85, .90, .95 and .99 levels.
         IF (LILIES .GT. .21595) THEN
            POWER(1) = POWER(1) + 1
         ENDIF
         IF (LILIES .GT. .22547) THEN
            POWER(2) = POWER(2) + 1
         ENDIF
         IF (LILIES .GT. .23857) THEN
            POWER(3) = POWER(3) + 1
         ENDIF
         IF (LILIES .GT. .25841) THEN
            POWER(4) = POWER(4) + 1
         ENDIF
         IF (LILIES .GT. .20564) THEN
            POWER(5) = POWER(5) + 1
         ENDIF

         IF (LILIE2 .GT. .13452) THEN
            POWER(6) = POWER(6) + 1
         ENDIF
         IF (LILIE2 .GT. .14278) THEN
            POWER(7) = POWER(7) + 1
         ENDIF
         IF (LILIE2 .GT. .15309) THEN
            POWER(8) = POWER(8) + 1
         ENDIF
         IF (LILIE2 .GT. .16858) THEN
            POWER(9) = POWER(9) + 1
         ENDIF
         IF (LILIE2 .GT. .20295) THEN
            POWER(10) = POWER(10) + 1
         ENDIF

         IF (ANDAR1 .GT. .46452) THEN
            POWER(11) = POWER(11) + 1
         ENDIF
         IF (ANDAR1 .GT. .51170) THEN
            POWER(12) = POWER(12) + 1
         ENDIF
         IF (ANDAR1 .GT. .58377) THEN
            POWER(13) = POWER(13) + 1
         ENDIF
         IF (ANDAR1 .GT. .66950) THEN
            POWER(14) = POWER(14) + 1
         ENDIF
         IF (ANDAR1 .GT. .90666) THEN
            POWER(15) = POWER(15) + 1
         ENDIF

         IF (ANDAR2 .GT. .44203) THEN
            POWER(16) = POWER(16) + 1
         ENDIF
         IF (ANDAR2 .GT. .50275) THEN
            POWER(17) = POWER(17) + 1
         ENDIF
         IF (ANDAR2 .GT. .57780) THEN
            POWER(18) = POWER(18) + 1
         ENDIF
         IF (ANDAR2 .GT. .71245) THEN
            POWER(19) = POWER(19) + 1
         ENDIF
         IF (ANDAR2 .GT. 1.05927) THEN
            POWER(20) = POWER(20) + 1
         ENDIF

         IF (CRVM .GT. .067204) THEN
            POWER(21) = POWER(21) + 1
         ENDIF
         IF (CRVM .GT. .076210) THEN
            POWER(22) = POWER(22) + 1
         ENDIF
         IF (CRVM .GT. .100425) THEN
            POWER(23) = POWER(23) + 1
         ENDIF
         IF (CRVM .GT. .120563) THEN
            POWER(24) = POWER(24) + 1
         ENDIF
         IF (CRVM .GT. .170314) THEN
            POWER(25) = POWER(25) + 1
         ENDIF

         IF (CRVM2 .GT. .071429) THEN
            POWER(26) = POWER(26) + 1
         ENDIF
         IF (CRVM2 .GT. .081378) THEN
            POWER(27) = POWER(27) + 1
         ENDIF
         IF (CRVM2 .GT. .096362) THEN
            POWER(28) = POWER(28) + 1
         ENDIF
         IF (CRVM2 .GT. .124018) THEN
            POWER(29) = POWER(29) + 1
         ENDIF
         IF (CRVM2 .GT. .181560) THEN
            POWER(30) = POWER(30) + 1
         ENDIF

  100 CONTINUE

      PRINT '(A)','1'
      PRINT *
      PRINT *
      PRINT *,'AGAINST THE CAUCHY DISTRIBUTION'
      PRINT *,'THE REJECTIONS AT N = ',NR,' ARE AS FOLLOWS: '
      PRINT *
      PRINT '(T7,A,T17,A,T27,A,T37,A,T47,A)','.80','.85',
     +      '.90','.95','.99'
      PRINT *,'FOR LILIEFORS: '
      PRINT '(T6,I4,T16,I4,T26,I4,T36,I4,T46,I4)',
     +      (POWER(I),I=1,5)
      PRINT *,'FOR LILIEFORS DOUBLED: '
      PRINT '(T6,I4,T16,I4,T26,I4,T36,I4,T46,I4)',
     +      (POWER(I),I=6,10)
      PRINT *,'FOR ANDERSON-DARLING: '
      PRINT '(T6,I4,T16,I4,T26,I4,T36,I4,T46,I4)',
     +      (POWER(I),I=11,15)
      PRINT *,'FOR ANDERSON-DARLING DOUBLED: '
      PRINT '(T6,I4,T16,I4,T26,I4,T36,I4,T46,I4)',
     +      (POWER(I),I=16,20)
      PRINT *,'FOR CRAMER-VON MISES: '
      PRINT '(T6,I4,T16,I4,T26,I4,T36,I4,T46,I4)',
     +      (POWER(I),I=21,25)
      PRINT *,'FOR CRAMER-VON MISES DOUBLED: '
      PRINT '(T6,I4,T16,I4,T26,I4,T36,I4,T46,I4)',
     +      (POWER(I),I=26,30)
      PRINT *
      PRINT *
      PRINT *

C     Convert rejection counts to powers.
      DO 500 I = 1,30
         PWR(I) = POWER(I)/5000.0
  500 CONTINUE

      PRINT *,'THE POWERS AT N = ',NR,' ARE AS FOLLOWS: '
      PRINT *,'FOR LILIEFORS: '
      PRINT '(T5,F6.4,T15,F6.4,T25,F6.4,T35,F6.4,T45,F6.4)',
     +      (PWR(I),I=1,5)
      PRINT *,'FOR LILIEFORS DOUBLED: '
      PRINT '(T5,F6.4,T15,F6.4,T25,F6.4,T35,F6.4,T45,F6.4)',
     +      (PWR(I),I=6,10)
      PRINT *,'FOR ANDERSON-DARLING: '
      PRINT '(T5,F6.4,T15,F6.4,T25,F6.4,T35,F6.4,T45,F6.4)',
     +      (PWR(I),I=11,15)
      PRINT *,'FOR ANDERSON-DARLING DOUBLED: '
      PRINT '(T5,F6.4,T15,F6.4,T25,F6.4,T35,F6.4,T45,F6.4)',
     +      (PWR(I),I=16,20)
      PRINT *,'FOR CRAMER-VON MISES: '
      PRINT '(T5,F6.4,T15,F6.4,T25,F6.4,T35,F6.4,T45,F6.4)',
     +      (PWR(I),I=21,25)
      PRINT *,'FOR CRAMER-VON MISES DOUBLED: '
      PRINT '(T5,F6.4,T15,F6.4,T25,F6.4,T35,F6.4,T45,F6.4)',
     +      (PWR(I),I=26,30)
      PRINT *
      PRINT *
      PRINT *,'COUNT1 = ',COUNT(1)
      PRINT *,'COUNT2 = ',COUNT(2)
      PRINT *,'COUNT3 = ',COUNT(3)
      PRINT *,'COUNT4 = ',COUNT(4)
      END